Abstract

The fundamental and natural connection between the infinite constellation (IC) dimension and the best diversity order it can achieve is investigated in this paper. In the first part of this work we develop an upper bound on the diversity order of IC's for any dimension and any number of transmit and receive antennas. By choosing the right dimensions, we prove in the second part of this work that IC's in general and lattices in particular can achieve the optimal diversity-multiplexing tradeoff of finite constellations. This work gives a framework for designing lattices for multiple-antenna channels using lattice decoding.

I. Introduction 

The use of multiple antennas in wireless communication has certain inherent advantages. On one hand, using multiple antennas in fading channels increases the reliability of the transmitted signal, i.e. it provides diversity. For instance, diversity can be attained by transmitting the same information on different paths between transmit-receive antenna pairs with i.i.d. Rayleigh fading distribution. The number of independent paths used is the diversity order of the transmission scheme. On the other hand, the use of multiple antennas increases the number of degrees of freedom available in the channel. In [1], [2] the ergodic channel capacity was obtained for multiple-input multiple-output (MIMO) systems with $M$ transmit and $N$ receive antennas, where the paths have i.i.d. Rayleigh fading distribution. It was shown that for large signal-to-noise ratio (SNR), the capacity behaves as $C(\mathrm{SNR}) \approx \min(M, N) \log(\mathrm{SNR})$. The multiplexing gain is the number of degrees of freedom utilized by the transmission scheme.

For the quasi-static Rayleigh flat-fading channel, Zheng and Tse [3] characterized the dependence between the diversity order and the multiplexing gain by deriving the optimal tradeoff between diversity and multiplexing, i.e. for each multiplexing gain the maximal diversity order was found. They showed that the optimal diversity-multiplexing tradeoff (DMT) can be attained by an ensemble of i.i.d. Gaussian codes, given that the block length is greater than or equal to $N + M - 1$. For this case, the tradeoff curve takes the form of the piecewise linear function that connects the points $(l, (N - l)(M - l))$, $l = 0, 1, \ldots, \min(M, N)$.

Space-time codes are coding schemes designed for MIMO systems, e.g. see [4], [5], [6] and references therein. The design of space-time codes in these works pursues various goals such as maximizing the diversity order, maximizing the multiplexing gain, or achieving the optimal DMT. El Gamal et al. [7] were the first to show that lattice coding and decoding achieve the optimal DMT. They presented lattice space-time (LAST) codes. These space-time codes are subsets of an infinite lattice, where the lattice dimensionality equals the number of degrees of freedom available in the channel, i.e. $\min(M, N)$, multiplied by the number of channel uses. By using a random ensemble of nested lattices, common randomness, minimum mean square error (MMSE) estimation followed by lattice decoding and a modulo-lattice operation, they showed that LAST codes can achieve the optimal DMT. It is worth mentioning that the MMSE estimation and the modulo operation take into account, in a certain sense, the finite codebook.

There has been extensive research on explicit lattice-based coding schemes that are DMT optimal. Explicit coding schemes that are DMT optimal for any number of transmit and receive antennas were presented in [6]. In addition, it was shown in [6] that $M$ channel uses are sufficient to obtain the optimal DMT. However, the DMT optimality in [6] was established using maximum-likelihood (ML) decoding, which may be infeasible due to its computational complexity. Another step towards finding explicit space-time coding schemes that attain the optimal DMT with low computational complexity was made by Jalden and Elia [8]. They considered explicit coding schemes based on the intersection between an underlying lattice and a shaping region. They showed that in the cases where these coding schemes attain the optimal DMT using ML decoding, they also attain it when using MMSE estimation in the receiver, followed by lattice decoding. The MMSE estimation relies on the power constraint, i.e. the shaping region boundaries. In addition, it was shown in [8] that by applying lattice reduction methods, the optimal DMT is attained when using suboptimal linear lattice decoders that require linear complexity as a function of the rate. This result applies to a wide range of explicit space-time codes such as golden codes [9], perfect space-time codes [10] and in general cyclic division algebra based space-time codes [6], and as these codes are approximately universal [11] it also applies to every statistical characterization of the fading channel. Note that these schemes take into consideration the finiteness of the codebook in the decoder. In our work we consider regular lattice decoding for lattices, i.e. decoding over the infinite lattice without taking into consideration the finite codebook.

The work in [7] also includes a lower bound on the diversity order, for the case $N \ge M$, for LAST codes shaped into a sphere with regular lattice decoding. For sufficiently large block length they showed that $d(r) \ge (N - M + 1)(M - r)$, where $r$ is the multiplexing gain and the lattice dimension per channel use is $M$. Taherzadeh and Khandani showed in [12] that this is also an upper bound on the diversity order of any LAST code shaped into a sphere and decoded with regular lattice decoding. These results show that LAST codes together with regular lattice decoding are suboptimal compared to the optimal DMT of power-constrained constellations.

Infinite constellations (IC's) are structures in the Euclidean space that have no power constraint. In [13], Poltyrev analyzed the performance of IC's over the additive white Gaussian noise (AWGN) channel. In this work we first extend the definitions of diversity order and multiplexing gain to the case where there is no power constraint. We also introduce a new term: the average number of dimensions per channel use, which is essentially the IC dimension divided by the number of channel uses. Then we extend the methods used in [13] in order to derive an upper bound on the diversity of any IC with a certain average number of dimensions per channel use, as a function of the multiplexing gain. It turns out that for a given number of dimensions per channel use, the diversity is a straight line as a function of the multiplexing gain that depends on the number of transmit and receive antennas. This analysis holds for any $M$ and $N$, and also applies to lattices with regular lattice decoding. We also find the average number of dimensions per channel use for which the upper bounds coincide with the optimal DMT of finite constellations. Finally, we show that for the aforementioned average number of dimensions per channel use, together with a sufficient number of channel uses, there exist sequences of lattices that attain different segments of the optimal DMT with regular lattice decoding, i.e. for each point in the DMT of [3] there exists a lattice sequence of certain dimension that achieves it with regular lattice decoding. Hence, this work characterizes the best DMT IC's may attain for any average number of dimensions per channel use, and proves that lattices can achieve the optimal DMT also by using regular lattice decoding, by adapting their dimensionality.

This work gives a framework for designing lattices for multiple-antenna channels using regular lattice decoding. It also shows the fundamental and natural connection between the IC dimension and its optimal diversity order. For instance, it is shown that for the case $M = N = 2$, the maximal diversity order of 4 can be achieved (with regular lattice decoding) by a lattice that has at most $\frac{4}{3}$ average number of dimensions per channel use. On the other hand the Alamouti scheme [14], that also has maximal diversity order of 4, utilizes only a single dimension per channel use in this setup. Hence, there is still room to improve by $\frac{1}{3}$ of a dimension per channel use. In addition, while in [7], [8] the MMSE estimation improves the channel in such a manner that enables the lattice decoder to attain the optimal DMT, this work shows that when considering regular lattice decoding, reducing the lattice dimensionality takes the role of MMSE estimation in the sense of improving the channel such that the optimal DMT is obtained. Finally, the analysis in this work gives another geometrical interpretation of the optimal DMT.

The outline of the paper is as follows. In Section II basic definitions for the fading channel and IC's are given. Section III presents a lower bound on the average decoding error probability of IC's for any channel realization, and an upper bound on the diversity order. An upper bound on the error probability for each channel realization, a transmission scheme that attains the best diversity order and some averaging arguments regarding the achievable diversity order of IC's are all presented in Section IV. A discussion of the difference between lattice constellations and lattice-based finite constellations is presented in Section V.



II. Basic Definitions 

We refer to the countable set $S = \{s_1, s_2, \ldots\}$ in $\mathbb{C}^n$ as an infinite constellation (IC). Let $\mathrm{cube}_l(a) \subset \mathbb{C}^n$ be a (possibly rotated) $l$-complex dimensional cube ($l \le n$) with edge of length $a$ centered around zero. An IC $S_l$ is $l$-complex dimensional if there exists a rotated $l$-complex dimensional cube $\mathrm{cube}_l(a)$ such that $S_l \subset \lim_{a\to\infty} \mathrm{cube}_l(a)$ and $l$ is minimal. $M(S_l, a) = |S_l \cap \mathrm{cube}_l(a)|$ is the number of points of the IC $S_l$ inside $\mathrm{cube}_l(a)$. In [13], the $n$-complex dimensional IC density for the AWGN channel was defined as the upper limit (the limit supremum) of the ratio $\gamma_G = \limsup_{a\to\infty} \frac{M(S, a)}{a^{2n}}$, and the volume to noise ratio (VNR) was given as $\mu_G = \frac{\gamma_G^{-1/n}}{2\pi e \sigma^2}$.

The Voronoi region of a point $x \in S_l$, denoted as $V(x)$, is the set of points in $\lim_{a\to\infty} \mathrm{cube}_l(a)$ closer to $x$ than to any other point in the IC. The effective radius of the point $x \in S_l$, denoted as $r_{\mathrm{eff}}(x)$, is the radius of the $l$-complex dimensional ball that has the same volume as the Voronoi region, i.e. $r_{\mathrm{eff}}(x)$ satisfies

$$|V(x)| = \frac{\pi^{l}\, r_{\mathrm{eff}}^{2l}(x)}{\Gamma(l+1)}. \tag{1}$$

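We consider a quasi-static flat-fading channel with $M$ transmit and $N$ receive antennas. We assume for this MIMO channel perfect channel knowledge at the receiver and no channel knowledge at the transmitter. The channel model is as follows:

$$\underline{y}_t = H \cdot \underline{x}_t + \rho^{-\frac{1}{2}} \underline{n}_t, \qquad t = 1, \ldots, T \tag{2}$$

where $\underline{x}_t$, $t = 1, \ldots, T$ is the transmitted signal, $\underline{n}_t \sim CN(0, \frac{2}{2\pi e} I_N)$ is the additive noise where $CN$ denotes complex-normal, $I_N$ is the $N$-dimensional unit matrix, and $\underline{y}_t \in \mathbb{C}^N$. $H$ is the fading matrix with $N$ rows and $M$ columns where $h_{i,j} \sim CN(0, 1)$, $1 \le i \le N$, $1 \le j \le M$, and $\rho^{-\frac{1}{2}}$ is a scalar that multiplies each element of $\underline{n}_t$, where $\rho$ plays the role of average SNR in the receive antenna for power constrained constellations, i.e. constellations that satisfy a constraint on $\frac{1}{T} \sum_{t=1}^{T} E\{\|\underline{x}_t\|^2\}$.

We also define the extended vector $\underline{x} = (\underline{x}_1^{\dagger}, \ldots, \underline{x}_T^{\dagger})^{\dagger}$. Suppose $\underline{x} \in S_l \subset \mathbb{C}^{MT}$, where $S_l$ is an IC with density $\gamma_{tr} = \limsup_{a\to\infty} \frac{M(S_l, a)}{a^{2l}}$ ($a^{2l}$ is the volume of $\mathrm{cube}_l(a)$). By defining $H_{ex}$ as an $NT \times MT$ block diagonal matrix, where each block on the diagonal equals $H$, $\underline{n}_{ex} = \rho^{-\frac{1}{2}} \cdot (\underline{n}_1^{\dagger}, \ldots, \underline{n}_T^{\dagger})^{\dagger} \in \mathbb{C}^{NT}$ and $\underline{y}_{ex} \in \mathbb{C}^{NT}$, we can rewrite the channel model in (2) as

$$\underline{y}_{ex} = H_{ex} \cdot \underline{x} + \underline{n}_{ex}. \tag{3}$$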

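In the sequel we use $L$ to denote $\min(M, N)$. We define $\sqrt{\lambda_i}$, $1 \le i \le L$, as the real-valued, non-negative singular values of $H$. We assume $\sqrt{\lambda_L} \ge \cdots \ge \sqrt{\lambda_1} > 0$. Our analysis is done for large values of $\rho$ (large VNR at the transmitter). We state that $f(\rho) \,\dot{\ge}\, g(\rho)$ when $\lim_{\rho\to\infty} -\frac{\ln f(\rho)}{\ln \rho} \le \lim_{\rho\to\infty} -\frac{\ln g(\rho)}{\ln \rho}$, and also define $\dot{\le}$, $\dot{=}$ in a similar manner by substituting $\le$ with $\ge$, $=$ respectively.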

We now turn to the IC definitions in the transmitter. We define the average number of dimensions per channel use as the IC dimension divided by the number of channel uses. We denote the average number of dimensions per channel use by $K$. Let us consider a $KT$-complex dimensional sequence of IC's $S_{KT}(\rho)$, where $K \le L$, and $T$ is the number of channel uses. First we define $\gamma_{tr} = \rho^{rT}$ as the density of $S_{KT}(\rho)$ in the transmitter. The IC multiplexing gain is defined as

$$MG(r) = \lim_{\rho\to\infty} \frac{1}{T} \log_{\rho}(\gamma_{tr} + 1) = \lim_{\rho\to\infty} \frac{1}{T} \log_{\rho}(\rho^{rT} + 1). \tag{4}$$

Note that $MG(r) = \max(0, r)$, i.e. for $0 \le r \le K$ the multiplexing gain is $r$. Roughly speaking, $\gamma_{tr} = \rho^{rT}$ gives us the number of points of $S_{KT}(\rho)$ within the $KT$-complex dimensional region $\mathrm{cube}_{KT}(1)$. In order to get the multiplexing gain, we normalize the exponent of the number of points within $\mathrm{cube}_{KT}(1)$, $rT$, by the number of channel uses, $T$. Note that the IC multiplexing gain, $r$, can be directly translated to a finite constellation multiplexing gain $r$ by considering the IC points within a shaping region. For more details see IV-C. The VNR in the transmitter is

$$\mu_{tr} = \frac{\gamma_{tr}^{-\frac{1}{KT}}}{2\pi e \sigma^2} = \rho^{1 - \frac{r}{K}} \tag{5}$$

where $\sigma^2 = \frac{\rho^{-1}}{2\pi e}$ is each dimension noise variance. Now we can understand the role of the multiplexing gain for IC's. The AWGN variance decreases as $\rho^{-1}$, where the IC density increases as $\rho^{rT}$. When $r = 0$ we get constant IC density as a function of $\rho$, where the noise variance decreases, i.e. we get the best error exponent. In this case the number of points within $\mathrm{cube}_{KT}(1)$ remains constant as a function of $\rho$. On the other hand, when $r = K$, we get VNR $\mu_{tr} = 1$, and from [13] we know that it inflicts average error probability that is bounded away from zero. In this case, the increase in the number of IC points within $\mathrm{cube}_{KT}(1)$ occurs at maximal rate.
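The behaviour of the multiplexing gain and of the transmitter VNR described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the limit in (4) is replaced by a single large value of $\rho$, and the closed form $\mu_{tr} = \rho^{1 - r/K}$ is assumed from the surrounding text (it is consistent with the statement that $r = K$ gives $\mu_{tr} = 1$).

```python
import math


def multiplexing_gain(r, T, rho):
    """Finite-rho value of (1/T) * log_rho(rho**(r*T) + 1), cf. eq. (4)."""
    gamma_tr = rho ** (r * T)  # transmitter density gamma_tr = rho^(rT)
    # log1p keeps precision when gamma_tr is tiny (negative r)
    return math.log1p(gamma_tr) / (T * math.log(rho))


def transmitter_vnr(r, K, rho):
    """Assumed closed form of the transmitter VNR: mu_tr = rho^(1 - r/K)."""
    return rho ** (1.0 - r / K)


if __name__ == "__main__":
    T, K, rho = 4, 2.0, 1e6
    for r in (-1.0, 0.0, 0.5, 1.5, 2.0):
        # the first value approaches max(0, r) as rho grows
        print(r, multiplexing_gain(r, T, rho), transmitter_vnr(r, K, rho))
```

For a fixed large $\rho$ the first printed value is close to $\max(0, r)$, and the VNR drops from $\rho$ at $r = 0$ to 1 at $r = K$.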

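Now we turn to the IC definitions in the receiver. First we define the set $H_{ex} \cdot \mathrm{cube}_{KT}(a)$ as the multiplication of each point in $\mathrm{cube}_{KT}(a)$ with the matrix $H_{ex}$. In a similar manner $S'_{KT} = H_{ex} \cdot S_{KT}$. The set $H_{ex} \cdot \mathrm{cube}_{KT}(a)$ is almost surely $KT$-complex dimensional (where $K \le L$) and in this case $M(S_{KT}, a) = |S_{KT} \cap \mathrm{cube}_{KT}(a)| = |S'_{KT} \cap (H_{ex} \cdot \mathrm{cube}_{KT}(a))|$. We define the receiver density as

$$\gamma_{rc} = \limsup_{a\to\infty} \frac{M(S_{KT}, a)}{\mathrm{Vol}(H_{ex} \cdot \mathrm{cube}_{KT}(a))}$$

i.e., the upper limit of the ratio of the number of IC points in $H_{ex} \cdot \mathrm{cube}_{KT}(a)$ and the volume of $H_{ex} \cdot \mathrm{cube}_{KT}(a)$. Based on the majorization property of a matrix singular values [15], we get that the volume of the set $H_{ex} \cdot \mathrm{cube}_{KT}(a)$ is smaller than $a^{2KT} \lambda_L^{T} \cdots \lambda_{L-B+1}^{T} \lambda_{L-B}^{\beta T}$, assuming $K = B + \beta$ where $B \in \mathbb{N}$ and $0 < \beta \le 1$, i.e. the volume is smaller than the product of the $B + 1$ strongest singular values, each raised to the power of the maximal number of channel uses it can take part in. Hence we get

$$\gamma_{rc} \ge \rho^{rT} \lambda_L^{-T} \cdots \lambda_{L-B+1}^{-T} \lambda_{L-B}^{-\beta T} \tag{6}$$

and the receiver VNR is

$$\mu_{rc} \le \rho^{1 - \frac{r}{K}} \cdot \lambda_L^{\frac{1}{K}} \cdots \lambda_{L-B+1}^{\frac{1}{K}} \lambda_{L-B}^{\frac{\beta}{K}}. \tag{7}$$

Note that for $N \ge M$ and $K = M$ we get $\gamma_{rc} = \rho^{rT} \cdot \prod_{i=1}^{M} \lambda_i^{-T}$ and $\mu_{rc} = \rho^{1 - \frac{r}{M}} \cdot \prod_{i=1}^{M} \lambda_i^{\frac{1}{M}}$. The average decoding error probability over the IC points of $S_{KT}(\rho)$, for a certain channel realization $H$, is defined as

$$P_e(H, \rho) = \limsup_{a\to\infty} \frac{\sum_{\underline{x}' \in S'_{KT} \cap (H_{ex} \cdot \mathrm{cube}_{KT}(a))} P_e(\underline{x}', H, \rho)}{M(S_{KT}, a)} \tag{8}$$

where $P_e(\underline{x}', H, \rho)$ is the error probability associated with $\underline{x}'$. The average decoding error probability of $S_{KT}(\rho)$ over all channel realizations is $P_e(\rho) = E_H\{P_e(H, \rho)\}$. Hence the diversity order equals

$$d = -\lim_{\rho\to\infty} \log_{\rho}(P_e(\rho)). \tag{9}$$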

III. Upper Bound on the Diversity Order 

In this section we derive an upper bound on the diversity order of any IC with average number of dimensions per channel use $K$ and any value of $T$, $M$ and $N$. We begin by deriving a lower bound on the average decoding error probability of $S_{KT}(\rho)$ for each channel realization. As in [3] and [7], we also define $\lambda_i = \rho^{-\alpha_i}$, $1 \le i \le L$. When the entries of the channel matrix $H$ are all i.i.d. with PDF $CN(0, 1)$, the PDF of its singular values is of the form $\rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}$ for large $\rho$ [3], where following the definitions above $0 \le \alpha_L \le \cdots \le \alpha_1$. By assigning in (6), (7) respectively, we can write

$$\gamma_{rc} \ge \rho^{T\left(r + \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B}\right)}$$

and

$$\mu_{rc} \le \rho^{1 - \frac{1}{K}\left(r + \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B}\right)}.$$

Theorem 1. For any $KT$-complex dimensional IC $S_{KT}(\rho)$ with transmitter density $\gamma_{tr} = \rho^{rT}$ and channel realization $\underline{\alpha} = (\alpha_1, \ldots, \alpha_L)^{T}$, we have the following lower bound on the average decoding error probability for $0 \le r \le K$

$$P_e(H, \rho) > \frac{C(KT)}{4} e^{-\mu_{rc} \cdot A(KT) + (KT-1) \ln(\mu_{rc})}$$

where $A(KT) = e \cdot \Gamma(KT + 1)^{\frac{1}{KT}}$ and $C(KT)$ is a positive constant that depends only on $KT$.

Proof: We divide the proof into two parts. In the first part we prove the result for lattices, which constitute a symmetric structure for which the Voronoi regions of different lattice points are identical. In the second part we prove the result for general IC's with receiver density $\gamma_{rc}$. As the second part of the proof is somewhat more involved, we defer it to Appendix A. Note that we could have used the tighter bounds of [17], but these bounds are not needed for the DMT. Instead we derive coarser and simpler bounds, which are sufficient for our purposes.

We begin by proving the result for lattices. Lattices constitute a discrete subgroup of the Euclidean space, with the ordinary vector addition operation. Consider a $KT$-complex dimensional lattice, $S'_{KT}(\rho)$, in the receiver with density $\gamma_{rc}$. The lattice points have identical Voronoi regions up to a translation. Hence, the volume of each Voronoi region equals

$$|V(x)| = \frac{1}{\gamma_{rc}} \qquad \forall x \in S'_{KT}(\rho).$$

According to the definition of the effective radius in (1), we get that $r_{\mathrm{eff}}(x) = r_{\mathrm{eff}}(\gamma_{rc}) = \left(\frac{\Gamma(KT+1)}{\gamma_{rc}\, \pi^{KT}}\right)^{\frac{1}{2KT}}$, $\forall x \in S'_{KT}(\rho)$.
Note that in lattices the maximum-likelihood (ML) decoding error probability is identical for all lattice points, i.e. the average
and maximal error probabilities are identical. It has been proven in ifTsl . ifTsl that the error probability of any lattice point in 
the receiver fulfils 

Pf-- > PriW^J > reff(7rc)) 

where Pg is the ML decoding error probability of any lattice point, and n^^ is the effective noise in the iiTT-complex 
dimensional hyperplane where Sj^rp[p) resides. We find an explicit expression for the lower bound 

Pr (11^x11 > ^cff(7rc)) > Pr (lln^Jl > r,s{^)) > J^^ a^KT2KT^KT) '^'' ^ a^KT%KTY(^KT)^ - ^^^^ 



-^A{KT) + (KT-l)\n{2n 



By assignmg r^^j = {—±—j^)kt we get 

Pe"'^ > C{KT) ■ e' 
1^ 

and by assigning prc = we get 

pS'KT > C{KT) ^ ^_^^^A(KT) + (KT-l)\r,{p.^,) ^ ^jj^ 

Note that in ( fTOl i we lower bounded the error probability with Tcff (^) instead of Tcff (7, c), and also in (fTTI ) we multiplied by j, 
in order to be consistent with the general lower bound for IC's shown in appendix lAl For lattices we have Pe{H, p) — pf'^'^. 



A generalization of the Rayleigh fading channel is the Jacobi fading channel. The optimal DMT for this channel was derived in [16].

Essentially what we have shown here is a scaled sphere packing bound. ■

Next, we would like to use this lower bound to average over the channel realizations and get an upper bound on the diversity order.

Theorem 2. The diversity order of any $KT$-complex dimensional sequence of IC's $S_{KT}(\rho)$, with $K$ average number of dimensions per channel use, is upper bounded by

$$d_{KT}(r) \le d_K^{*}(r) = MN\left(1 - \frac{r}{K}\right)$$

for $0 < K \le \frac{MN}{N+M-1}$, and

$$d_{KT}(r) \le d_K^{*}(r) = (M-l)(N-l)\frac{K}{K-l}\left(1 - \frac{r}{K}\right)$$

for $\frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1 < K \le \frac{(M-l)(N-l)}{N+M-1-2l} + l$ and $l = 1, \ldots, L-1$. In all of these cases $0 \le r \le K$.

Proof: For any IC with VNR $\mu_{rc}$, assigning $\mu'_{rc} \ge \mu_{rc}$ in the lower bound from Theorem 1 also gives a lower bound on the error probability

$$P_e(H, \rho) > \frac{C(KT)}{4} e^{-\mu'_{rc} \cdot A(KT) + (KT-1) \ln(\mu'_{rc})}.$$

It results from the fact that inflating the IC into an IC with VNR $\mu'_{rc}$ must decrease the error probability, where

$$\frac{C(KT)}{4} e^{-\mu'_{rc} \cdot A(KT) + (KT-1) \ln(\mu'_{rc})}$$

is a lower bound on the error probability of any IC with VNR $\mu'_{rc}$. Hence, for the case $\mu_{rc} \le 1$ we can lower bound the error probability by assigning 1 in the lower bound and get $\frac{C(KT)}{4} e^{-A(KT)}$, i.e. the average decoding error probability is bounded away from 0 for any value of $\rho$. We can give the event $\mu_{rc} \le 1$ the interpretation of an outage event.

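We would like to set a lower bound for the error probability for each channel realization $\underline{\alpha}$, which we denote by $P_e^{LB}(\rho, \underline{\alpha})$. We know that $\mu_{rc} \le \rho^{1 - \frac{1}{K}\left(r + \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B}\right)}$. For the case $\sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} < K - r$, we take

$$P_e^{LB}(\rho, \underline{\alpha}) = \frac{C(KT)}{4} e^{-L(\rho, \underline{\alpha}) \cdot A(KT) + (KT-1) \ln(L(\rho, \underline{\alpha}))}$$

where $L(\rho, \underline{\alpha}) = \rho^{1 - \frac{1}{K}\left(r + \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B}\right)} > 1$. For the case $\sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} \ge K - r$ we get that $\mu_{rc} \le 1$, and we take

$$P_e^{LB}(\rho, \underline{\alpha}) = \frac{C(KT)}{4} e^{-A(KT)}.$$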

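In order to find an upper bound on the diversity order, we would like to average $P_e^{LB}(\rho, \underline{\alpha})$ over the channel realizations. In our analysis we consider large values of $\rho$, and so we calculate

$$P_e(\rho) \,\dot{\ge}\, \int_{\underline{\alpha} \ge 0} P_e^{LB}(\rho, \underline{\alpha}) \cdot \rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}\, d\underline{\alpha} \tag{12}$$

where $\underline{\alpha} \ge 0$ signifies the fact that $\alpha_1 \ge \cdots \ge \alpha_L \ge 0$. By defining $A = \{\underline{\alpha} \mid \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} < K - r;\ \underline{\alpha} \ge 0\}$ and $\bar{A} = \{\underline{\alpha} \mid \sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} \ge K - r;\ \underline{\alpha} \ge 0\}$ we can split (12) into 2 terms

$$P_e(\rho) \,\dot{\ge}\, \int_{\underline{\alpha} \in A} P_e^{LB}(\rho, \underline{\alpha}) \cdot \rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}\, d\underline{\alpha} + \int_{\underline{\alpha} \in \bar{A}} P_e^{LB}(\rho, \underline{\alpha}) \cdot \rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}\, d\underline{\alpha}. \tag{13}$$

Hence

$$P_e(\rho) \,\dot{\ge}\, \int_{\underline{\alpha} \in \bar{A}} P_e^{LB}(\rho, \underline{\alpha}) \cdot \rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}\, d\underline{\alpha}. \tag{14}$$

In a similar manner to [3], [7], for very large $\rho$, we approximate the average value by finding the most dominant exponential term in the integral. For this we would like to find the minimal value of

$$\lim_{\rho\to\infty} -\log_{\rho}\left(P_e^{LB}(\rho, \underline{\alpha}) \cdot \rho^{-\sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i}\right)$$

for the case $\underline{\alpha} \in \bar{A}$. For $\underline{\alpha} \in \bar{A}$, we get that $P_e^{LB}(\rho, \underline{\alpha})$ is bounded away from 0 for any value of $\rho$. Hence, in order to find the most dominant error event we would like to find $\min_{\underline{\alpha}} \sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i$ given that $\underline{\alpha} \in \bar{A}$. The minimal value is achieved at the boundary, i.e. for $\underline{\alpha}$ satisfying $\sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} = K - r$, $\underline{\alpha} \ge 0$. Hence, for any $K \le L$ we state that

$$d_{KT}(r) \le \min_{\underline{\alpha}} \sum_{i=1}^{L}(|N-M|+2i-1)\alpha_i, \qquad 0 \le r \le K \tag{15}$$

where $\sum_{i=0}^{B-1} \alpha_{L-i} + \beta \alpha_{L-B} = K - r$ and $\alpha_1 \ge \cdots \ge \alpha_L \ge 0$. Basically this optimization problem is a linear programming problem whose solution is as follows. For $0 < K \le \frac{MN}{N+M-1}$ the solution is $\alpha_i = 1 - \frac{r}{K}$, $i = 1, \ldots, L$. For $\frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1 < K \le \frac{(M-l)(N-l)}{N+M-1-2l} + l$ and $l = 1, \ldots, L-1$ the solution is $\alpha_L = \cdots = \alpha_{L-l+1} = 0$ and $\alpha_{L-l} = \cdots = \alpha_1 = \frac{K-r}{K-l}$. The desired upper bound is attained by substituting the optimal values of $\underline{\alpha}$ in (15). The detailed solution for the optimization problem is presented in Appendix B. ■

Note that while Theorem 1 refers to $KT$-complex dimensional IC's, the lower bound derived in this theorem applies for any $2KT$-real dimensional IC.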
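The linear program behind the diversity bound can be cross-checked numerically against the closed form of Theorem 2. The sketch below is an illustration under stated assumptions, not the derivation used in the paper's appendix: it assumes the objective weights $|N-M|+2i-1$ and the constraint $\sum_{i=0}^{B-1}\alpha_{L-i}+\beta\alpha_{L-B}=K-r$ with $K = B+\beta$, $0<\beta\le 1$, and it solves the program by the substitution $\alpha_i=\sum_{j\ge i}d_j$, $d_j\ge 0$, after which the minimum sits at a single non-zero $d_j$. The function names are hypothetical.

```python
from fractions import Fraction
import math


def lp_diversity_bound(M, N, K, r):
    """min sum_i (|N-M|+2i-1)*alpha_i over alpha_1 >= ... >= alpha_L >= 0
    subject to sum_{i=0}^{B-1} alpha_{L-i} + beta*alpha_{L-B} = K - r."""
    K, r = Fraction(K), Fraction(r)
    L = min(M, N)
    if not (0 < K <= L and 0 <= r <= K):
        raise ValueError("need 0 < K <= min(M, N) and 0 <= r <= K")
    B = math.ceil(K) - 1          # K = B + beta with 0 < beta <= 1
    beta = K - B
    # w[i]: weight of alpha_i (1-based) in the equality constraint
    w = [Fraction(0)] * (L + 1)
    for i in range(L - B + 1, L + 1):
        w[i] = Fraction(1)
    w[L - B] = beta
    # c[i]: weight of alpha_i in the objective
    c = [Fraction(0)] + [Fraction(abs(N - M) + 2 * i - 1) for i in range(1, L + 1)]
    best = None
    for j in range(1, L + 1):
        cost = sum(c[1:j + 1])    # cost of raising alpha_1..alpha_j together
        weight = sum(w[1:j + 1])  # their joint weight in the constraint
        if weight > 0:
            value = (K - r) * cost / weight
            best = value if best is None else min(best, value)
    return best


def closed_form_bound(M, N, K, r):
    """d*_K(r) as stated in Theorem 2 (l = 0 gives MN(1 - r/K))."""
    K, r = Fraction(K), Fraction(r)
    for l in range(min(M, N)):
        K_l = Fraction((M - l) * (N - l), N + M - 1 - 2 * l) + l
        if K <= K_l:
            return (M - l) * (N - l) * (K - r) / (K - l)
    raise ValueError("K exceeds min(M, N)")


if __name__ == "__main__":
    for K in (1, 2, Fraction(5, 2), 3):
        print(K, lp_diversity_bound(4, 3, K, 0), closed_form_bound(4, 3, K, 0))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the two functions can be compared for equality at the segment boundaries.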
From Theorem 2 we get an upper bound on the diversity order by assuming transmission of the $KT$ complex dimensions over the $B + 1$ strongest singular values. This assumption is equivalent to assuming beamforming, which may improve the coding gain but does not increase the diversity order. This assumption allows us to derive a lower bound on the average decoding error probability. However, we still get maximal diversity order of $MN$ in this case.

Let us consider as an illustrative example the case of $M = N = 2$. In this case, for $0 < K \le \frac{4}{3}$ we get $d_K^{*}(r) = 4(1 - \frac{r}{K})$. For $\frac{4}{3} < K \le 2$ we get $d_K^{*}(r) = \frac{K}{K-1}(1 - \frac{r}{K})$. In both cases $0 \le r \le K$. For this setup we have two singular values and so $\alpha_1 \ge \alpha_2 \ge 0$. The optimization problem is of the form $\min_{\underline{\alpha} \ge 0} \alpha_1 + 3\alpha_2$, where for $0 < K \le 1$ the constraint is $\beta \alpha_2 = K - r$, and for $1 < K \le 2$ the constraint is $\alpha_2 + \beta \alpha_1 = K - r$. For the case $0 < K < \frac{4}{3}$ the optimization problem solution is $\alpha_1 = \alpha_2 = 1 - \frac{r}{K}$, i.e. in this case the most dominant error event occurs when both singular values are very small. For the case $K = \frac{4}{3}$ the constraint is of the form $\alpha_2 + \frac{\alpha_1}{3} = \frac{4}{3} - r$, and the optimization problem solution is achieved for both $\alpha_1 = \alpha_2 = 1 - \frac{3r}{4}$ and $\alpha_2 = 0$, $\alpha_1 = 4 - 3r$. For the case $\frac{4}{3} < K \le 2$ the optimization problem solution is achieved for $\alpha_2 = 0$, $\alpha_1 = \frac{K-r}{K-1}$, i.e. one strong singular value and another very weak singular value.



Fig. 1. The diversity order as a linear function of the multiplexing gain $r$ for $M = 4$, $N = 3$ and $K = 1, 2, 2.5$ and $3$.

Corollary 1. For $0 < K \le \frac{MN}{N+M-1}$ we get $d_K^{*}(0) = MN$. For $\frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1 < K \le \frac{(M-l)(N-l)}{N+M-1-2l} + l$, $l = 1, \ldots, L-1$, we get $d_K^{*}(l) = (M-l)(N-l)$.

Proof: The proof is straightforward from the properties of $d_K^{*}(r)$. ■
From Corollary 1 we get that the range of $K$ can be divided into segments, where for each segment we have a set of straight lines that all coincide at a certain integer point. Note that at these points we get the same values as the optimal DMT for finite constellations.



Corollary 2. In the range $l \le r \le l + 1$, the maximal possible diversity order is achieved at dimension $K_l = \frac{(M-l)(N-l)}{N+M-1-2l} + l$ and gives

$$d_{K_l}^{*}(r) = (M-l)(N-l)\frac{K_l}{K_l - l}\left(1 - \frac{r}{K_l}\right) = (M-l)(N-l) - (r-l)(N+M-2l-1)$$

where $l = 0, \ldots, L-1$.

Proof: The proof is straightforward from the properties of $d_K^{*}(r)$. ■
From Corollary 2 we can see that $d_{K_l}^{*}(l) = (M-l)(N-l)$ and $d_{K_l}^{*}(l+1) = (M-l-1)(N-l-1)$. We also know that $d_{K_l}^{*}(r)$ is a straight line. Also, the optimal DMT for finite constellations consists of a straight line in the range $l \le r \le l+1$, that equals $(N-l)(M-l)$ when $r = l$ and $(M-l-1)(N-l-1)$ when $r = l+1$. Hence, in the range $l \le r \le l+1$ for $K_l = \frac{(M-l)(N-l)}{N+M-1-2l} + l$, we get an upper bound that equals the optimal DMT of finite constellations presented in [3]. As for each $l = 0, \ldots, L-1$ we have such $K_l$, taking

$$\max_{0 < K \le L} d_K^{*}(r), \qquad 0 \le r \le L$$

gives us the optimal DMT for finite constellations.
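The claim that the upper envelope of the lines $d_{K_l}^{*}(r)$ reproduces the optimal DMT can be verified numerically. The following is a minimal sketch under stated assumptions: it uses the expressions $K_l = \frac{(M-l)(N-l)}{N+M-1-2l}+l$ and $d_{K_l}^{*}(r) = (M-l)(N-l)-(r-l)(N+M-2l-1)$ from Corollary 2, restricts each line to $r \le K_l$, and compares the maximum with the piecewise-linear curve through the points $(l, (M-l)(N-l))$. All function names are hypothetical.

```python
from fractions import Fraction


def k_l(M, N, l):
    """Dimension per channel use K_l at which segment l of the DMT is met."""
    return Fraction((M - l) * (N - l), N + M - 1 - 2 * l) + l


def line_at_k_l(M, N, l, r):
    """d*_{K_l}(r) from Corollary 2 (meaningful for 0 <= r <= K_l)."""
    return (M - l) * (N - l) - (Fraction(r) - l) * (N + M - 2 * l - 1)


def optimal_dmt(M, N, r):
    """Piecewise-linear curve through (l, (M-l)(N-l)), l = 0..min(M, N)."""
    r = Fraction(r)
    L = min(M, N)
    l = min(r.numerator // r.denominator, L - 1)  # segment index, floor(r) capped
    d0 = (M - l) * (N - l)
    d1 = (M - l - 1) * (N - l - 1)
    return d0 + (r - l) * (d1 - d0)


def envelope(M, N, r):
    """max over l of d*_{K_l}(r), using only the lines with K_l >= r."""
    L = min(M, N)
    return max(line_at_k_l(M, N, l, r) for l in range(L) if k_l(M, N, l) >= r)


if __name__ == "__main__":
    M, N = 4, 3
    print([k_l(M, N, l) for l in range(min(M, N))])
    for q in range(0, 13):
        r = Fraction(q, 4)
        print(r, envelope(M, N, r), optimal_dmt(M, N, r))
```

For $M = 4$, $N = 3$ the two printed columns agree on the whole grid, matching the discussion of Figure 1.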

Figure 1 illustrates the properties of $d_K^{*}(r)$ following Corollaries 1 and 2. We take the example of $M = 4$, $N = 3$. For $0 < K \le 2$ we get upper bounds that have diversity order 12 for $r = 0$. We can see that in the range $0 \le r \le 1$, the upper bound of $K = 2$ is maximal and equals the optimal DMT of finite constellations. In the range $2 < K \le 2.5$ we can see that the upper bounds have the same diversity order 6 at $r = 1$. In the range $1 \le r \le 2$, the upper bound of $K = 2.5$ is maximal and equals the optimal DMT of finite constellations in this range. For $2.5 < K \le 3$, the upper bounds equal 2 at $r = 2$. In the range $2 \le r \le 3$, the upper bound of $K = 3$ is maximal and again equals the optimal DMT of finite constellations in this range.

Fig. 2. $d_K^{*}(0)$ as a function of the IC dimensions per channel use $K$, for $M = 4$, $N = 3$.

Figure 2 presents the maximal diversity order that can be attained for different average numbers of dimensions per channel use, for the case $M = 4$ and $N = 3$, i.e. the upper bound on the diversity order for $r = 0$, $d_K^{*}(0)$, where $0 < K \le 3$. In the range $0 < K \le 2$ we get $d_K^{*}(0) = 12$. It coincides with the result presented in Figure 1, where we showed that in this range the straight lines have the same value for $r = 0$. Hence, for IC's, one can use up to 2 average number of dimensions per channel use without compromising the diversity order. Starting from $K > 2$, the tradeoff starts to kick in and the maximal diversity order starts to decrease as we increase the average number of dimensions per channel use. Also note that for $K = 3$ the diversity order is 6 when $r = 0$.

IV. Attaining the Best Diversity Order 
In this section we show that the upper bound derived in Section III is achievable by a sequence of IC's in general and lattices in particular. First we present a transmission scheme for any $M$, $N$, $K_l = \frac{(M-l)(N-l)}{N+M-1-2l} + l$ and $T_l = N + M - 1 - 2l$, $l = 0, \ldots, L-1$, where as previously defined $L = \min(M, N)$. Then we introduce the effective channel of the transmission scheme. Following that we extend the methods presented in [13], in order to derive an upper bound on the average decoding error probability of an ensemble of IC's, for each channel realization. By averaging the upper bound over the channel realizations, we find the achievable DMT of IC's at these dimensions and show that it coincides with the optimal DMT of finite constellations. Finally, we discuss peak to average properties of the transmission scheme, and show that there exists a single sequence of IC's that attains the optimal DMT.

A. The Transmission Scheme 

The transmission matrix $G_l$, $l = 0, \ldots, L-1$, has $M$ rows that represent the transmit antennas, and $T_l = N + M - 1 - 2l$ columns that represent the channel uses.

We begin by describing the transmission matrix structure in general for any $M$ and $N$.

1) For $N \ge M$ and $K_{M-1} = \frac{(N-M+1)M}{N-M+1} = M$: the matrix $G_{M-1}$ has $N - M + 1$ columns (channel uses). In the first column transmit symbols $x_1, \ldots, x_M$ on the $M$ antennas, and in the $(N-M+1)$-th column transmit symbols $x_{M(N-M)+1}, \ldots, x_{M(N-M+1)}$ on the $M$ antennas.

2) For $M > N$ and $K_{N-1} = \frac{(M-N+1)N}{M-N+1} = N$: the matrix $G_{N-1}$ has $M - N + 1$ columns. In the first column transmit symbols $x_1, \ldots, x_N$ on antennas $1, \ldots, N$, and in the $(M-N+1)$-th column transmit symbols $x_{N(M-N)+1}, \ldots, x_{N(M-N+1)}$ on antennas $M - N + 1, \ldots, M$.

3) For $K_l$, $l = 0, \ldots, L-2$: the matrix $G_l$ has $M + N - 1 - 2l$ columns. We add to $G_{l+1}$, the transmission scheme of $K_{l+1}$, two columns in order to get $G_l$. In the first added column transmit $l + 1$ symbols on antennas $1, \ldots, l+1$. In the second added column transmit $l + 1$ different symbols on antennas $M - l, \ldots, M$.
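Example: $M = 4$, $N = 3$. In this case the transmission scheme for $K = 3$, $2.5$ and $2$ ($G_2$, $G_1$ and $G_0$ respectively) is as follows:

$$\begin{pmatrix}
x_1 & 0 & x_7 & 0 & x_{11} & 0 \\
x_2 & x_4 & x_8 & 0 & 0 & 0 \\
x_3 & x_5 & 0 & x_9 & 0 & 0 \\
0 & x_6 & 0 & x_{10} & 0 & x_{12}
\end{pmatrix} \tag{16}$$

where $G_2$ consists of the first two columns, $G_1$ of the first four columns, and $G_0$ of all six columns.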
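The construction rules for $G_l$ can be turned into a short routine that reproduces the $M = 4$, $N = 3$ example. This is a sketch under stated assumptions rather than the authors' implementation: entries are symbol indices ($k$ stands for $x_k$, 0 for no transmission), and for $M > N$ the intermediate columns of $G_{N-1}$ are assumed to use antennas $i, \ldots, i+N-1$ in column $i$, which matches the first and last columns given in the text. The function name is hypothetical.

```python
def transmission_matrix(M, N, l):
    """Return G_l as an M x T_l list of lists of symbol indices (0 = no symbol)."""
    L = min(M, N)
    if not 0 <= l <= L - 1:
        raise ValueError("need 0 <= l <= min(M, N) - 1")
    columns = []  # each column: 1-based antenna indices that carry a symbol
    if N >= M:
        # G_{M-1}: N-M+1 columns, all M antennas active
        columns += [list(range(1, M + 1)) for _ in range(N - M + 1)]
    else:
        # G_{N-1}: M-N+1 columns, column i uses antennas i..i+N-1 (assumption)
        columns += [list(range(i, i + N)) for i in range(1, M - N + 2)]
    # going from G_{s+1} to G_s adds two columns with s+1 symbols each
    for s in range(L - 2, l - 1, -1):
        columns.append(list(range(1, s + 2)))      # antennas 1..s+1
        columns.append(list(range(M - s, M + 1)))  # antennas M-s..M
    G = [[0] * len(columns) for _ in range(M)]
    k = 0
    for t, antennas in enumerate(columns):
        for a in antennas:
            k += 1
            G[a - 1][t] = k  # symbol x_k on antenna a at channel use t+1
    return G


if __name__ == "__main__":
    for row in transmission_matrix(4, 3, 0):
        print(row)
```

The number of non-zero entries can be compared with $K_l T_l = (M-l)(N-l) + l\,(N+M-1-2l)$, and the number of columns with $T_l = N+M-1-2l$.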



B. The Effective Channel 

Next we define the effective channel matrix induced by the transmission scheme. In accordance with the channel model from (2), the multiplication $H \cdot G_l$ yields a matrix with $N$ rows and $T_l$ columns, where each column equals $H \cdot \underline{x}_t$, $t = 1, \ldots, T_l$, as in (2). We are interested in transmitting a $K_l T_l$-complex dimensional IC with $K_l T_l$ complex symbols. Hence, in the proposed transmission scheme, $G_l$ has exactly $K_l T_l$ non-zero complex entries that represent the $K_l T_l$-complex dimensional IC within $\mathbb{C}^{M T_l}$. For each column of $G_l$, denoted by $\underline{g}_i$, $i = 1, \ldots, T_l$, we define the effective channel that $\underline{g}_i$ sees as $H_i$. It consists of the columns of $H$ that correspond to the non-zero entries of $\underline{g}_i$, i.e. $H \cdot \underline{g}_i = H_i \cdot \tilde{\underline{g}}_i$, where $\tilde{\underline{g}}_i$ equals the non-zero entries of $\underline{g}_i$. As an example assume without loss of generality that the first $l_i$ entries of $\underline{g}_i$ are not zero. In this case $H_i$ is an $N \times l_i$ matrix equal to the first $l_i$ columns of $H$. In accordance with (3), $H_{\mathrm{eff}}^{(l)}$ is an $N T_l \times K_l T_l$ block diagonal matrix consisting of $T_l$ blocks. Each block corresponds to the multiplication of $H$ with a different column of $G_l$, i.e. $H_i$ is the $i$'th block of $H_{\mathrm{eff}}^{(l)}$. Note that in the effective matrix $N T_l \ge K_l T_l$.

We would like to elaborate on the structure of the blocks of $H_{\mathrm{eff}}^{(l)}$. For this reason we denote the columns of $H$ as $\underline{h}_i$, $i = 1, \ldots, M$.

1) The case where $N \ge M$. For this case the transmission scheme has $N + M - 1 - 2l$ columns. The first $N - M + 1$ columns of $G_l$, $\underline{g}_1, \ldots, \underline{g}_{N-M+1}$, contain $M \cdot (N - M + 1)$ different complex symbols, i.e. there are no zero entries in these columns. Hence, in this case the first $N - M + 1$ blocks of $H_{\mathrm{eff}}^{(l)}$ are

$$H_i = H, \qquad i = 1, \ldots, N - M + 1. \tag{17}$$

After the first $N - M + 1$ columns we have $M - 1 - l$ pairs of columns. For each pair we have

$$H_{N-M+2k} = \{\underline{h}_1, \ldots, \underline{h}_{M-k}\} \tag{18}$$

and

$$H_{N-M+2k+1} = \{\underline{h}_{k+1}, \ldots, \underline{h}_M\} \tag{19}$$

where $k = 1, \ldots, M - l - 1$.

2) The case where $M > N$. Again the transmission scheme has $N + M - 1 - 2l$ columns. By the definition of the first $M - N + 1$ columns of $G_l$, we get that

$$H_i = \{\underline{h}_i, \ldots, \underline{h}_{N+i-1}\}, \qquad i = 1, \ldots, M - N + 1. \tag{20}$$

We have additional $N - 1 - l$ pairs of columns in $G_l$. For each of these pairs we get

$$H_{M-N+2k} = \{\underline{h}_1, \ldots, \underline{h}_{N-k}\} \tag{21}$$

and

$$H_{M-N+2k+1} = \{\underline{h}_{M-N+k+1}, \ldots, \underline{h}_M\} \tag{22}$$

where $k = 1, \ldots, N - 1 - l$.

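Example: consider $M = 4$, $N = 3$ as presented in (16). In this case $l = 0, 1, 2$ and we have $K_2 = 3$, $K_1 = 2.5$ and $K_0 = 2$ respectively.

1) $K_2 = 3$: $H_{\mathrm{eff}}^{(2)}$ is generated from the multiplication of the $3 \times 4$ matrix $H$ with the first two columns of the transmission matrix. In this case $H_{\mathrm{eff}}^{(2)}$ is a $6 \times 6$ block diagonal matrix, consisting of two blocks. Each block is a $3 \times 3$ matrix. We get that $H_1 = \{\underline{h}_1, \underline{h}_2, \underline{h}_3\}$ and $H_2 = \{\underline{h}_2, \underline{h}_3, \underline{h}_4\}$.

2) $K_1 = \frac{5}{2} = 2.5$: $H_{\mathrm{eff}}^{(1)}$ is a $12 \times 10$ block diagonal matrix consisting of 4 blocks. The first two blocks are identical to the blocks of $H_{\mathrm{eff}}^{(2)}$. The additional two blocks (multiplication with columns 3-4) are $3 \times 2$ matrices. We get that $H_3 = \{\underline{h}_1, \underline{h}_2\}$ and $H_4 = \{\underline{h}_3, \underline{h}_4\}$.

3) $K_0 = 2$: $H_{\mathrm{eff}}^{(0)}$ consists of six blocks. In this case the last two blocks are $3 \times 1$ vectors. We get that $H_5 = \underline{h}_1$ and $H_6 = \underline{h}_4$.

We present $H_{\mathrm{eff}}^{(0)}$ of our example in equation (23). Note that $\underline{h}_i \in \mathbb{C}^{3}$ for $1 \le i \le 4$, and $\underline{0}$ is a $3 \times 1$ vector.

$$H_{\mathrm{eff}}^{(0)} = \begin{pmatrix}
\underline{h}_1 & \underline{h}_2 & \underline{h}_3 & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} \\
\underline{0} & \underline{0} & \underline{0} & \underline{h}_2 & \underline{h}_3 & \underline{h}_4 & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} \\
\underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{h}_1 & \underline{h}_2 & \underline{0} & \underline{0} & \underline{0} & \underline{0} \\
\underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{h}_3 & \underline{h}_4 & \underline{0} & \underline{0} \\
\underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{h}_1 & \underline{0} \\
\underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{0} & \underline{h}_4
\end{pmatrix} \tag{23}$$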

From the sequential construction of the blocks of $H_{\mathrm{eff}}^{(l)}$ in (17)-(19) and (20)-(22) it is easy to see that when two columns of $H$ 
occur in a certain block of $H_{\mathrm{eff}}^{(l)}$, the columns of $H$ between them must also occur in the same block, i.e. if $\underline{h}_1, \underline{h}_5$ occur in 
a certain block, then $\underline{h}_2, \underline{h}_3, \underline{h}_4$ also occur in the same block. Next we prove a property of the transmission scheme $G_l$ that 
relates to the number of occurrences of the columns of $H$ in the blocks of $H_{\mathrm{eff}}^{(l)}$. For each set of columns in $H$, we give an 
upper bound on the number of its appearances in different blocks. 

Lemma 1. Consider the transmission scheme $G_l$, $l = 0, \ldots, L - 1$. In case $0 \le i - j < L$, the columns $\underline{h}_j, \ldots, \underline{h}_i$ may occur 
together in at most $N - i + j$ blocks of $H_{\mathrm{eff}}^{(l)}$. In case $i - j \ge L$ they cannot occur together in any block of $H_{\mathrm{eff}}^{(l)}$. 

Proof: See appendix C. ■ 
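Lemma 1 can be sanity-checked by brute force for small antenna configurations. The sketch below is only a numerical check under the block structure of (17)-(22), not a substitute for the proof; the helper names `effective_blocks`, `joint_occurrences` and `lemma1_holds` are hypothetical.

```python
def effective_blocks(M, N, l):
    """Blocks of the effective channel as lists of 1-based column indices."""
    blocks = []
    if N >= M:
        blocks += [list(range(1, M + 1)) for _ in range(N - M + 1)]   # (17)
        for k in range(1, M - l):
            blocks.append(list(range(1, M - k + 1)))                  # (18)
            blocks.append(list(range(k + 1, M + 1)))                  # (19)
    else:
        blocks += [list(range(i, i + N)) for i in range(1, M - N + 2)]  # (20)
        for k in range(1, N - l):
            blocks.append(list(range(1, N - k + 1)))                  # (21)
            blocks.append(list(range(M - N + k + 1, M + 1)))          # (22)
    return blocks


def joint_occurrences(M, N, l, j, i):
    """Number of blocks that contain all of the columns h_j, ..., h_i."""
    wanted = set(range(j, i + 1))
    return sum(1 for block in effective_blocks(M, N, l) if wanted <= set(block))


def lemma1_holds(M, N):
    """Check the bound of Lemma 1 for every l and every column range."""
    L = min(M, N)
    for l in range(L):
        for j in range(1, M + 1):
            for i in range(j, M + 1):
                bound = N - i + j if i - j < L else 0
                if joint_occurrences(M, N, l, j, i) > bound:
                    return False
    return True


if __name__ == "__main__":
    print(all(lemma1_holds(M, N) for M in range(1, 7) for N in range(1, 7)))
```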



C. Upper Bound on The Error Probability 

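Next we would like to derive an upper bound on the average decoding error probability of an ensemble of $K_lT_l$-complex 
dimensional IC's, for each channel realization. We define $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}| = \rho^{-\sum_{i=1}^{K_lT_l} \eta_i}$, where $\rho^{-\frac{\eta_i}{2}}$ is the $i$'th singular value of 
$H_{\mathrm{eff}}^{(l)}$, $1 \le i \le K_lT_l$. We also define $\underline{\eta} = (\eta_1, \ldots, \eta_{K_lT_l})^T$. Note that $NT_l \ge K_lT_l$. 

Theorem 3. There exists a sequence of $K_lT_l$-complex dimensional IC's, with channel realization $H_{\mathrm{eff}}^{(l)}$ and a receiver VNR 
$\mu_{rc} = \rho^{1 - \frac{r}{K_l} - \sum_{i=1}^{K_lT_l} \frac{\eta_i}{K_lT_l}}$, that has an average decoding error probability 

$$\overline{P_e}(\rho, \underline{\eta}) \le D(K_lT_l)\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i}$$

where $D(K_lT_l)$ is a constant independent of $\rho$, and $\eta_i \ge 0$ for every $1 \le i \le K_lT_l$. 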

Proof: We base our proof on the techniques developed by Poltyrev [13] for the AWGN channel. However, the channel 
considered here is colored. In spite of that, we show that what affects the average decoding error probability is the singular 
values product, which is encapsulated by the receiver VNR, $\mu_{rc}$. This observation enables us to facilitate this colored channel 
analysis. 

Based on [13] we have the following upper bound on the maximum-likelihood (ML) decoding error probability of each 
$K_lT_l$-complex dimensional IC point $\underline{x} \in S_{K_lT_l}$ 

$$P_e(\underline{x}) \le Pr(\|\underline{n}_{ex}\| \ge R) + \sum_{\underline{l} \in Ball(\underline{x},2R) \cap S_{K_lT_l},\ \underline{l} \ne \underline{x}} Pr(\|\underline{l} - \underline{x} - \underline{n}_{ex}\| < \|\underline{n}_{ex}\|) \tag{24}$$

where $Ball(\underline{x}, 2R)$ is a $K_lT_l$-complex dimensional ball of radius $2R$ centered around $\underline{x}$, and $\underline{n}_{ex}$ is the effective noise in the 
$K_lT_l$-complex dimensional hyperplane where the IC resides. Note that the second term in (24) represents the pairwise error 
probability to points within $Ball(\underline{x}, 2R)$, i.e. the decision region is at distance $R$ at most. 

Next we upper bound the average decoding error probability of an ensemble of constellations drawn uniformly within 
$cube_{K_lT_l}(b)$. Each code-book contains $\lfloor \gamma_{tr} b^{2K_lT_l} \rfloor$ points, where each point is drawn uniformly within $cube_{K_lT_l}(b)$. In the 
receiver, the random ensemble is uniformly distributed within $\{H_{\mathrm{eff}}^{(l)} \cdot cube_{K_lT_l}(b)\}$. Let us consider a certain point, $\underline{x} \in 
\{H_{\mathrm{eff}}^{(l)} \cdot cube_{K_lT_l}(b)\}$, from the random ensemble in the receiver. We denote the ring around $\underline{x}$ by $Ring(\underline{x}, i\Delta) = Ball(\underline{x}, i\Delta) \setminus 
Ball(\underline{x}, (i - 1)\Delta)$. The average number of points within $Ring(\underline{x}, i\Delta)$ of the random ensemble is 

Av{x ,jA) = 7re|i/i^) • cnheK,TAb)nRingix,tA)\ < j,,\Ringix ,tA)\ < ' {tAf^^^'~' A (25) 

' ' I [Kill + L) 

where using the upper bounds on the error probability d24l . and the average number of points within 

the rings (25), we get for a certain channel realization the following upper bound on the average decoding error probability of 
the finite constellations ensemble, at point $\underline{x}$ 

PFix,p,ri) < Pri\\n,J > R) + ^,,Q{KiTi) J2 > ^^^^) ' m^'^'^'-'A (26) 

where Q{KiTi) — YiW^T^+lf' ^^'^ fi-ex,! is the first component of n^^ (the pairwise error probability has scalar decision 
region). By taking A we get 

, /-sfl ^ 

Pf^ix ,p,r2)< Pr{\\n^J > R)+j,,Q{KiTi) / Pr(ne.,i > -)x^'''^'~^dx. (27) 

Jo ^ 

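Note that this upper bound applies for any value of $R \ge 0$ and $b$, and does not depend on $\underline{x}$, i.e. $P_e^{FC}(\underline{x}, \rho, \underline{\eta}) = P_e^{FC}(\rho, \underline{\eta})$. 

Now we divide the channel realizations into two subsets: $A = \{\underline{\eta} \mid \sum_{i=1}^{K_lT_l} \eta_i \le T_l(K_l - r),\ \eta_i \ge 0\}$, where $\underline{\eta} = (\eta_1, \ldots, \eta_{K_lT_l})$, 
and $\overline{A} = \{\underline{\eta} \mid \sum_{i=1}^{K_lT_l} \eta_i > T_l(K_l - r),\ \eta_i \ge 0\}$. For each set we upper bound the error probability. We begin with the case 
$\underline{\eta} \in A$. For this case we upper bound the terms in (27) and find an upper bound on the error probability as a function of the 
receiver VNR, $\mu_{rc} = \rho^{1 - \frac{r}{K_l} - \sum_{i=1}^{K_lT_l} \frac{\eta_i}{K_lT_l}}$. We begin by upper bounding the integral of the second term in (27). Note that 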

X _ 

Pr{nc^A > <e i^. 
Hence, the integral in the second term in (l27t can be upper bounded by 

where 

cr-in}i\T{KiTi)2^'^i'^ ' i-'^ ~ -P''(ll^cx|| < 2i?) < 1. As a rcsult we get the following upper bound 

r2R. 



I Priuex.i > ^)x''''^'-'dx < a^^'^T(i^,T023^' 



3K,T,-2 



(28) 



By assigning this upper bound in the second term of (|27] | we get 



2' - r(KiTi + l) 2e^'T'' 

(29) 

Next we upper bound Pr(||7T,cx|| > R), the first term in (l27t . We choose 
For r/ e we get that 

By using the upper bounds from f\3\, we know that for the case 
Hence we get 

Pr(||nex|l > Pcff) < e-^'^'p"^''''^^' ■ p'^'i^'-^-^^^? • e^'^' . (30) 

The fact that $\underline{\eta} \in A$ has two significant consequences: the VNR is greater than or equal to 1, and as $\rho$ increases the maximal VNR 
in the set also increases. For very large VNR in the receiver, the upper bound of the first term, (30), is negligible compared 
to the upper bound on the second term, (29). On the other hand, the set of rather small VNR values is fixed for increasing $\rho$ 
(the VNR is greater than or equal to 1). Hence there must exist a coefficient $D'(K_lT_l)$ that gives us 

$$P_e^{FC}(\rho, \underline{\eta}) \le D'(K_lT_l)\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i} \tag{31}$$

for any $\rho$ and $\underline{\eta} \in A$, where $P_e^{FC}(\rho, \underline{\eta})$ is the average decoding error probability of the ensemble of constellations, for a certain 
channel realization. 

Note that we could also take $R > R_{\mathrm{eff}}$, as the upper bound in (29) does not depend on $R$ and the upper bound in (30) would 
only decrease in this case. It results from the fact that we are interested in the exponential behavior of the error probability, 
and we consider a fixed VNR (as a function of $\rho$) as an outage event. This allows us to take cruder bounds than in [13], 
bounds that do not depend on $R$. 
For the case $\underline{\eta} \in \overline{A}$, we get 

$$\rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i} > 1.$$

Hence, we can upper bound the error probability for $\underline{\eta} \in \overline{A}$ by 1. We can also upper bound the error probability for this case 
by the upper bound from equation (31), as long as we state that $D'(K_lT_l) \ge 1$. Hence, the upper bound from (31) applies for 
$\eta_i \ge 0$, $1 \le i \le K_lT_l$. 

So far we upper bounded the average decoding error probability of the ensemble of finite constellations. We extend now 
these finite constellations into an ensemble of IC's with density $\gamma_{tr}$, and show that the upper bound on the average decoding 
error probability does not change. Let us consider a certain finite constellation, $C_0(\rho, b) \subset cube_{K_lT_l}(b)$, from the random 
ensemble. We extend it into an IC 

$$IC(\rho, K_lT_l) = C_0(\rho, b) + (b + b') \cdot \mathbb{Z}^{2K_lT_l} \tag{32}$$

where without loss of generality we assumed that $cube_{K_lT_l}(b) \subset \mathbb{C}^{K_lT_l}$. In the receiver we have 

$$IC(\rho, K_lT_l, H_{\mathrm{eff}}^{(l)}) = H_{\mathrm{eff}}^{(l)} \cdot C_0(\rho, b) + (b + b') H_{\mathrm{eff}}^{(l)} \cdot \mathbb{Z}^{2K_lT_l}. \tag{33}$$



By extending each finite constellation in the ensemble into an IC according to the method presented in (32), we get a new 
ensemble of IC's. We would like to set $b$ and $b'$ to be large enough such that the IC's ensemble average decoding error 
probability has the same upper bound as in (31), and a density that equals $\gamma_{rc}$ up to a coefficient. First we would like to set a 
value for $b'$. Increasing $b'$ decreases the error probability inflicted by the codewords outside the set $\{H_{\mathrm{eff}}^{(l)} \cdot C_0(\rho, b)\}$. Without 
loss of generality, we upper bound the error probability of the points $\underline{x} \in \{H_{\mathrm{eff}}^{(l)} \cdot C_0(\rho, b)\} \subset IC(\rho, K_lT_l, H_{\mathrm{eff}}^{(l)})$, denoted by 
$P_e^{IC}(H_{\mathrm{eff}}^{(l)} \cdot C_0)$. Due to the tiling symmetry, $P_e^{IC}(H_{\mathrm{eff}}^{(l)} \cdot C_0)$ is also the average decoding error probability of the entire IC. 
We begin with $\underline{\eta} \in A$. For this case, we upper bound the IC error probability in the following manner 

$$P_e^{IC}(H_{\mathrm{eff}}^{(l)} \cdot C_0) \le P_e^{FC}(H_{\mathrm{eff}}^{(l)} \cdot C_0) + P_e(H_{\mathrm{eff}}^{(l)} \cdot (IC \setminus C_0))$$

where $P_e^{FC}(H_{\mathrm{eff}}^{(l)} \cdot C_0)$ is the error probability of the finite constellation $\{H_{\mathrm{eff}}^{(l)} \cdot C_0\}$, and $P_e(H_{\mathrm{eff}}^{(l)} \cdot (IC \setminus C_0))$ is the average 
decoding error probability to points in the set $\{H_{\mathrm{eff}}^{(l)} \cdot (IC \setminus C_0)\}$. For the case $\underline{\eta} \in A$, we know that $0 \le \eta_i \le T_l(K_l - r)$. 
Hence, the constriction caused by the channel in each dimension cannot be smaller than $\rho^{-\frac{T_l}{2}(K_l - r)}$. As a result, for any 

xi e {iJ^I^ • Co} and X2 € {H^g ■ {IC \ Co)} we get \\x-^ - x^W > 2b' ■ p-^(^i~^). By choosing 6' = 

we get for 2 e ^ that \\xi — X2II > 2^ ^tT'P'^- Hence we get 

Pe(<^ • (/C\Co)) < Pr{\\nJ\ > \I^P% 
For p > 1 we get according to the bounds in lfT3l that 

Pr 



^^Jl > .[^p-)) < e-^'^'P^+>^''^'(i+^)e^'^'. 
V 7re 

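As a result, there exists a coefficient $D''(K_lT_l)$ such that 

$$P_e(H_{\mathrm{eff}}^{(l)} \cdot (IC \setminus C_0)) \le D''(K_lT_l)\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i}$$

for $\underline{\eta} \in A$ and $\rho \ge 1$. This bound applies for any IC in the ensemble. From (31) we can state that $P_e^{FC}(\rho, \underline{\eta}) = E_{C_0}(P_e^{FC}(H_{\mathrm{eff}}^{(l)} \cdot 
C_0)) \le D'(K_lT_l)\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i}$. Hence 

$$\overline{P_e}(\rho, \underline{\eta}) \le D(K_lT_l)\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i} \tag{34}$$

where $\overline{P_e}(\rho, \underline{\eta}) = E_{C_0}(P_e^{IC}(H_{\mathrm{eff}}^{(l)} \cdot C_0))$ is the average decoding error probability of the ensemble of IC's defined in (33), and 

$D = 2\max(D', D'') \ge 1$. 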

Next, we set the value of b to be large enough such that each IC density from the ensemble in ( l33T l, 7^^, equals jrc up to 
a factor of 2. By choosing 6 = 6 • p'^ we get 

Ire ~ I'TcK , , ) — Ire 



'b + b" ' 

For each value p > 1, we get \"frc < Ire ^ Ire- As a result we have 
Note that in our proof we referred to a matrix of dimension $NT_l \times K_lT_l$. However, these results apply for any full rank matrix 
whose number of rows is greater than or equal to its number of columns. ■ 
By averaging arguments we know that there exists a sequence of IC's that satisfies these requirements. 

D. Achieving the Optimal DMT 

In this subsection we calculate the DMT of the proposed transmission scheme. We upper bound the determinant of the 
effective channel inverse, $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|^{-1}$, based on the effective channel properties presented in subsection IV-B. In Theorem 
3 we showed that the upper bound on the error probability depends on this determinant. Hence, the upper bound on the 
determinant gives us a new upper bound on the average decoding error probability. We average the new upper bound over all 
channel realizations and get the DMT of the transmission scheme. 

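The channel matrix $H$ consists of $N \cdot M$ i.i.d entries, where each entry has distribution $h_{i,j} \sim CN(0, 1)$. Without loss of 
generality we consider the case where the columns of $H$ are drawn sequentially from left to right, i.e. $\underline{h}_1$ is drawn first, then 
$\underline{h}_2$ is drawn, et cetera. Column $\underline{h}_j$ is an $N$-dimensional vector. Given $\underline{h}_{\max(1,j-N+1)}, \ldots, \underline{h}_{j-1}$ we can write 

$$\underline{h}_j = \Theta(\underline{h}_{\max(1,j-N+1)}, \ldots, \underline{h}_{j-1}) \cdot \tilde{\underline{h}}_j$$

where $\Theta(\cdot)$ is an $N \times N$ unitary matrix. $\Theta(\cdot)$ is chosen such that: 

1) The first element of $\tilde{\underline{h}}_j$, $\tilde{h}_{1,j}$, is in the direction of $\underline{h}_{j-1}$. 

2) The second element, $\tilde{h}_{2,j}$, is in the direction orthogonal to $\underline{h}_{j-1}$, in the hyperplane spanned by $\{\underline{h}_{j-1}, \underline{h}_{j-2}\}$. 

3) Element $\tilde{h}_{\min(j,N)-1,j}$ is in the direction orthogonal to the hyperplane spanned by $\{\underline{h}_{\max(2,j-N+2)}, \ldots, \underline{h}_{j-1}\}$ inside 
the hyperplane spanned by $\{\underline{h}_{\max(1,j-N+1)}, \ldots, \underline{h}_{j-1}\}$. 

4) The rest of the $N - \min(j, N) + 1$ elements are in directions orthogonal to the hyperplane $\{\underline{h}_{\max(1,j-N+1)}, \ldots, \underline{h}_{j-1}\}$. 

Note that $\tilde{h}_{i,j}$, $1 \le i \le N$, $1 \le j \le M$, are i.i.d random variables with distribution $CN(0, 1)$. Let us denote by $\underline{h}_{j \perp j-1, \ldots, j-k}$ 
the component of $\underline{h}_j$ which resides in the $N - k$ subspace which is perpendicular to the space spanned by $\{\underline{h}_{j-1}, \ldots, \underline{h}_{j-k}\}$. 
In this case we get 

$$\|\underline{h}_{j \perp j-1, \ldots, j-k}\|^2 = \sum_{i=k+1}^{N} |\tilde{h}_{i,j}|^2, \qquad 1 \le k \le \min(j, N) - 1. \tag{35}$$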

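If we assign $|\tilde{h}_{i,j}|^2 = \rho^{-\xi_{i,j}}$ we get that the probability density function (PDF) of $\xi_{i,j}$ is 

$$f(\xi_{i,j}) = C \cdot \log \rho \cdot \rho^{-\xi_{i,j}} \cdot e^{-\rho^{-\xi_{i,j}}} \tag{36}$$

where $C$ is a normalization factor. In our analysis we assume a very large value for $\rho$. Hence we can neglect events where 
$\xi_{i,j} < 0$ since in this case the PDF (36) decreases exponentially as a function of $\rho$. For a very large $\rho$ and $\xi_{i,j} \ge 0$, $1 \le i \le N$ 
and $1 \le j \le M$, the PDF takes the following form 

$$f(\xi_{i,j}) \propto \rho^{-\xi_{i,j}}, \qquad \xi_{i,j} \ge 0. \tag{37}$$

In this case by assigning in (35) the vector $\underline{\xi}_j = (\xi_{1,j}, \ldots, \xi_{N,j})^T$, whose PDF is proportional to $\rho^{-\sum_{i=1}^{N} \xi_{i,j}}$, we get 

$$\|\underline{h}_{j \perp j-1, \ldots, j-k}\|^2 \doteq \rho^{-a(k, \underline{\xi}_j)} \tag{38}$$

where $1 \le k \le \min(j, L) - 1$ and $a(k, \underline{\xi}_j) = \min_{s \in \{k+1, \ldots, N\}} \xi_{s,j}$. In addition 

$$\|\underline{h}_j\|^2 \doteq \rho^{-a(0, \underline{\xi}_j)}. \tag{39}$$

Note that 

$$a(\min(j, L) - 1, \underline{\xi}_j) \ge \cdots \ge a(0, \underline{\xi}_j) \ge 0. \tag{40}$$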
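The high-SNR approximation behind (36)-(37) can be checked numerically. The sketch below assumes only that $|\tilde{h}_{i,j}|^2$ is exponentially distributed with unit mean, which holds for a $CN(0,1)$ entry; the function names are hypothetical. It shows that the probability of $\xi_{i,j} \ge t$ decays like $\rho^{-t}$ for $t > 0$, while negative values of $\xi_{i,j}$ have a probability that vanishes exponentially in a power of $\rho$.

```python
import math


def prob_xi_at_least(t, rho):
    """P(xi >= t), where |h|^2 = rho**(-xi) and |h|^2 ~ Exp(1).

    xi >= t is the event |h|^2 <= rho**(-t), whose probability is
    1 - exp(-rho**(-t)); expm1 keeps precision for tiny arguments.
    """
    return -math.expm1(-(rho ** (-t)))


def prob_xi_at_most(t, rho):
    """P(xi <= t), i.e. the event |h|^2 >= rho**(-t)."""
    return math.exp(-(rho ** (-t)))


def tail_exponent(t, rho):
    """Exponent e such that P(xi >= t) = rho**(-e)."""
    return -math.log(prob_xi_at_least(t, rho)) / math.log(rho)


if __name__ == "__main__":
    for rho in (1e2, 1e4, 1e6):
        print(rho, tail_exponent(1.5, rho), prob_xi_at_most(-0.5, rho))
```

As $\rho$ grows, `tail_exponent(t, rho)` approaches `t`, which is the behavior captured by the PDF in (37).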

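Next we wish to quantify the contribution of a certain column in the channel matrix, $\underline{h}_j$, to the determinant $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|$. 
$H_{\mathrm{eff}}^{(l)}$ is a block diagonal matrix. Hence the determinant $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|$ can be expressed as 

$$|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}| = \prod_{i=1}^{T_l} |H_i^{\dagger} H_i|. \tag{41}$$

Assume $H_i = (\underline{h}_1, \ldots, \underline{h}_m)$, i.e. $H_i$ has $m$ columns. In this case we can state that the determinant 

$$|H_i^{\dagger} H_i| = \|\underline{h}_1\|^2 \cdot \|\underline{h}_{2 \perp 1}\|^2 \cdots \|\underline{h}_{m \perp m-1, \ldots, 1}\|^2.$$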

Note that $H_i$ also has more rows than columns. The columns of $H_i$ are a subset of the columns of the channel matrix $H$. Hence 
we are interested in the blocks where $\underline{h}_j$ occurs. We know that the contribution of $\underline{h}_j$ to those determinants can be quantified 
by taking into account the columns to its left in each block. We consider two cases: 

• The case $N \ge M$. In this case we can see from (17)-(19) that $\underline{h}_j$ may occur with $\{\underline{h}_1, \ldots, \underline{h}_{j-1}\}$ to its left in different 
blocks. 

• The case $M > N$. In this case we can see from (20)-(22) that $\underline{h}_j$ may occur only with $\{\underline{h}_{\max(1,j-N+1)}, \ldots, \underline{h}_{j-1}\}$ to its 
left in different blocks. 

Based on (38) and (39) we can quantify the contribution of $\underline{h}_j$ to $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|$ by 

$$\|\underline{h}_j\|^{2 b_j(0)} \prod_{k=1}^{\min(j,L)-1} \|\underline{h}_{j \perp j-1, \ldots, j-k}\|^{2 b_j(k)} \doteq \rho^{-\sum_{k=0}^{\min(j,L)-1} b_j(k)\, a(k, \underline{\xi}_j)} \tag{42}$$

where $b_j(k)$ is the number of occurrences of $\underline{h}_j$ in the blocks of $H_{\mathrm{eff}}^{(l)}$ with only $\{\underline{h}_{j-1}, \ldots, \underline{h}_{j-k}\}$ to its left. $b_j(0)$ is the 
number of occurrences of $\underline{h}_j$ with no columns to its left. Note that from the definition of the transmission scheme we get that 
for $l = 0$, $b_j(k) > 0$ for $1 \le k \le \min(j, L) - 1$. 

In the following theorem we calculate the DMT of the proposed transmission scheme. 

Theorem 4. There exists a sequence of $K_lT_l$-complex dimensional IC's with transmitter density $\gamma_{tr} = \rho^{rT_l}$ and $T_l$ channel 
uses that has diversity order 

$$d_{K_lT_l}(r) \ge (M - l)(N - l) - (r - l)(N + M - 2 \cdot l - 1)$$

where $0 \le r \le K_l$ and $l = 0, \ldots, L - 1$. 

Proof: The proof outline is as follows. The upper bound on the error probability from Theorem 3 depends on $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|^{-1}$. 
We upper bound this determinant value and average over different realizations of $H_{\mathrm{eff}}^{(l)}$ in order to find the diversity order of 
the transmission matrix $G_l$. We begin by lower bounding $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|$. Based on the sequential structure of $G_l$, we lower bound 
the contribution of a certain column of $H$, $\underline{h}_j$, $1 \le j \le M$, to the determinant. This gives us a new upper bound on the 
error probability for each channel realization. We average the new upper bound on the error probability, by averaging over 
$\underline{h}_1, \ldots, \underline{h}_M$. From this averaging we get the required diversity order. 

Specifically, we first lower bound the contribution of $\underline{h}_j$ to the determinant (42), by upper bounding $\sum_{k=0}^{\min(j,L)-1} b_j(k)\, a(k, \underline{\xi}_j)$. 
Based on Lemma 1 and the fact that when two columns of $H$ occur together in a block of $H_{\mathrm{eff}}^{(l)}$, all the columns of $H$ between 
them must also occur in the same block, we get 

$$\sum_{s=k}^{\min(j,L)-1} b_j(s) \le N - k, \qquad 0 \le k \le \min(j, L) - 1, \tag{43}$$

where $\sum_{s=k}^{\min(j,L)-1} b_j(s)$ is the number of occurrences of $\{\underline{h}_j, \ldots, \underline{h}_{j-k}\}$ in the blocks of $H_{\mathrm{eff}}^{(l)}$. Hence, we can state that 

$$\sum_{s=0}^{\min(j,L)-1} b_j(s) \le N$$

by assigning $k = 0$ in (43). Also note that for $l = 0$, the sum $\sum_{s=0}^{\min(j,L)-1} b_j(s)\, a(s, \underline{\xi}_j)$ is larger than for any other $1 \le l \le L - 1$. 
From the inequalities in (40), and the fact that for $l = 0$ we get $b_j(k) > 0$ for any $1 \le k \le \min(j, L) - 1$, we can state that 

$$\sum_{s=0}^{\min(j,L)-1} b_j(s)\, a(s, \underline{\xi}_j) \le \sum_{s=0}^{\min(j,L)-2} a(s, \underline{\xi}_j) + (N - \min(j, L) + 1)\, a(\min(j, L) - 1, \underline{\xi}_j) \triangleq c(j). \tag{44}$$

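Using (42) and (44) we can state that for a vector $\underline{\xi}_j$ whose PDF is proportional to $\rho^{-\sum_{i=1}^{N} \xi_{i,j}}$ we can lower bound the 
contribution of $\underline{h}_j$ to $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|$ by 

$$\|\underline{h}_j\|^{2 b_j(0)} \prod_{k=1}^{\min(j,L)-1} \|\underline{h}_{j \perp j-1, \ldots, j-k}\|^{2 b_j(k)} \,\dot{\ge}\, \rho^{-c(j)}. \tag{45}$$

By taking into account the contribution of each column to the determinant we get that 

$$|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}| = \prod_{j=1}^{M} \|\underline{h}_j\|^{2 b_j(0)} \prod_{k=1}^{\min(j,L)-1} \|\underline{h}_{j \perp j-1, \ldots, j-k}\|^{2 b_j(k)}. \tag{46}$$

By considering the set of vectors $\underline{\xi}_1, \ldots, \underline{\xi}_M$, whose PDF is proportional to $\rho^{-\sum_{j=1}^{M} \sum_{i=1}^{N} \xi_{i,j}}$, and by using the lower bound 
from (45) we get 

$$|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}| \,\dot{\ge}\, \rho^{-\sum_{j=1}^{M} c(j)}. \tag{47}$$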
The upper bound on the error probability presented in Theorem 3 is proportional to 

$$\rho^{-T_l(K_l - r)} \cdot |H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|^{-1} = \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i} \tag{48}$$

for $\eta_i \ge 0$ and $1 \le i \le K_lT_l$, where $\rho^{-\frac{\eta_i}{2}}$ are the singular values of $H_{\mathrm{eff}}^{(l)}$. Hence, in order to use the upper bound from 
Theorem 3 in our analysis, we need to show that by taking $\xi_{i,j} \ge 0$, $1 \le i \le N$, $1 \le j \le M$, we also get that $\eta_i \ge 0$, 
$1 \le i \le K_lT_l$. Note that the entries of $H_{\mathrm{eff}}^{(l)}$ are elements of the channel matrix $H$. Also, all the columns of $H$ must appear 
in $H_{\mathrm{eff}}^{(l)}$. Hence, from trace considerations we get 



KiTi 

As a result $\min_{i,j}(\xi_{i,j}) \ge 0$ if and only if $\min_s(\eta_s) \ge 0$, and so $\eta_s \ge 0$ for every $1 \le s \le K_lT_l$. As the upper bound on 
the error probability in (48) applies for $\eta_i \ge 0$, $1 \le i \le K_lT_l$, this upper bound also applies whenever $\xi_{i,j} \ge 0$, $1 \le i \le N$ 
and $1 \le j \le M$. In equation (47) we found a lower bound on the determinant. We use this lower bound to upper bound the 
determinant of the matrix inverse 

$$|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|^{-1} \,\dot{\le}\, \rho^{\sum_{j=1}^{M} c(j)} \tag{49}$$

and as a consequence we can upper bound the error probability. 

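We can express the average decoding error probability over the ensemble of IC's for large $\rho$ as follows 

$$\overline{P_e}(\rho) = \int_{H} P_e(\rho, H) f(H)\, dH = \int_{\xi_{i,j} \ge 0} P_e(\rho, \xi_{i,j}) f(\xi_{i,j})\, d\xi_{i,j} \tag{50}$$

where $P_e(\rho, H) = P_e(\rho, \xi_{i,j})$ is the ensemble average decoding error probability per channel realization, and $\xi_{i,j} \ge 0$ means 
$\xi_{i,j} \ge 0$ for $1 \le i \le N$ and $1 \le j \le M$. We divide the integration range into two sets: $A = \{\xi_{i,j} \mid \sum_{j=1}^{M} \sum_{i=1}^{N} \xi_{i,j} \le 
T_l(K_l - r);\ \xi_{i,j} \ge 0\}$ and $\overline{A} = \{\xi_{i,j} \mid \sum_{j=1}^{M} \sum_{i=1}^{N} \xi_{i,j} > T_l(K_l - r);\ \xi_{i,j} \ge 0\}$. Hence, we can write the average decoding 
error probability as follows 

$$\overline{P_e}(\rho) = \int_{\xi_{i,j} \in A} P_e(\rho, \xi_{i,j}) f(\xi_{i,j})\, d\xi_{i,j} + \int_{\xi_{i,j} \in \overline{A}} P_e(\rho, \xi_{i,j}) f(\xi_{i,j})\, d\xi_{i,j}. \tag{51}$$

We begin by upper bounding the first term of the error probability in (51). Based on Theorem 3 the average decoding error 
probability per channel realization is upper bounded by $P_e(\rho, H) \,\dot{\le}\, \rho^{-T_l(K_l - r) + \sum_{i=1}^{K_lT_l} \eta_i}$. Using the upper bound on the 
determinant (49) and the fact that $|H_{\mathrm{eff}}^{(l)\dagger} H_{\mathrm{eff}}^{(l)}|^{-1} = \rho^{\sum_{i=1}^{K_lT_l} \eta_i}$, we get that the first term of the error probability (51) is upper 
bounded by 

$$\int_{\xi_{i,j} \in A} \rho^{-T_l(K_l - r) + \sum_{j=1}^{M} \left( c(j) - \sum_{i=1}^{N} \xi_{i,j} \right)}\, d\xi_{i,j}. \tag{52}$$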

Now we prove a Lemma that shows that the exponent of the integrand in the upper bound from (52) is negative for $\xi_{i,j} \ge 0$. 

Lemma 2. Consider $\xi_{i,j} \ge 0$ for $1 \le i \le N$ and $1 \le j \le M$. The sum 

$$\sum_{i=1}^{N} \xi_{i,j} - c(j) \ge 0$$

for every $1 \le j \le M$. 

Proof: See appendix D. ■ 
In a similar manner to [3], [7], for a very large $\rho$ and a finite integration range, we can approximate the integral by finding 
the most dominant exponential term in (52). Based on Lemma 2 we know that the exponent of the integrand is always negative. 
Hence, we can approximate the upper bound by finding 

$$\min_{\xi_{i,j} \in A}\ T_l(K_l - r) + \sum_{j=1}^{M} \left( \sum_{i=1}^{N} \xi_{i,j} - c(j) \right).$$

As $\sum_{i=1}^{N} \xi_{i,j} - c(j) \ge 0$, the minimum is achieved when $\sum_{i=1}^{N} \xi_{i,j} - c(j) = 0$ for $1 \le j \le M$. This can be achieved for 
instance by taking $\xi_{i,j} = 0$ for $1 \le i \le N$, $1 \le j \le M$. In this case we get that the diversity order equals $T_l(K_l - r)$, which 
is the best diversity order possible for IC's of complex dimension $K_lT_l$. 
Next we upper bound the second term of the error probability from (51). For $\xi_{i,j} \in \overline{A}$ we upper bound the average decoding 
error probability per channel realization by 1. In this case we get 

$$\int_{\xi_{i,j} \in \overline{A}} \rho^{-\sum_{j=1}^{M} \sum_{i=1}^{N} \xi_{i,j}}\, d\xi_{i,j}. \tag{53}$$

Again we approximate this integral by calculating the most dominant exponential term, i.e. $\min_{\xi_{i,j} \in \overline{A}} \sum_{j=1}^{M} \sum_{i=1}^{N} \xi_{i,j}$. The 
minimal value for this case is also $T_l(K_l - r)$. Hence, we get a diversity order $T_l(K_l - r)$ for the second term. As a result 
we can state that for both terms in (51) we get the same diversity order, and the transmission scheme diversity order is lower 
bounded by $T_l(K_l - r)$. The proof is concluded. ■ 

The diversity order attained in Theorem 4 for $K_l$, $T_l$ coincides with the optimal DMT of finite constellations in the range 
$l \le r \le l + 1$. Hence, by considering $0 \le l \le L - 1$, we can attain the optimal DMT with $L$ sequences of IC's. 
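The coincidence between the bound of Theorem 4 and the optimal DMT on each segment can be verified with exact arithmetic. The sketch below is illustrative only: `optimal_dmt` encodes the piecewise-linear curve through the points $(k, (M-k)(N-k))$, and `ic_diversity` evaluates $T_l(K_l - r)$ with $T_l = N + M - 1 - 2 \cdot l$ and $K_lT_l = (M-l)(N-l) + l \cdot T_l$, which is the value of the Theorem 4 expression at $r = 0$. Both function names are assumptions made here.

```python
from fractions import Fraction


def optimal_dmt(M, N, r):
    """Piecewise-linear optimal DMT through the points (k, (M-k)(N-k))."""
    r = Fraction(r)
    L = min(M, N)
    if not 0 <= r <= L:
        raise ValueError("r must satisfy 0 <= r <= min(M, N)")
    k = min(int(r), L - 1)  # index of the segment that contains r
    d_left = (M - k) * (N - k)
    d_right = (M - k - 1) * (N - k - 1)
    return d_left + (r - k) * (d_right - d_left)


def ic_diversity(M, N, l, r):
    """T_l * (K_l - r), the diversity order of Theorem 4 for segment l."""
    r = Fraction(r)
    T = N + M - 1 - 2 * l
    K = Fraction((M - l) * (N - l) + l * T, T)
    if not 0 <= r <= K:
        raise ValueError("r must satisfy 0 <= r <= K_l")
    return T * (K - r)


if __name__ == "__main__":
    # M = N = 2, l = 0: the line 4 - 3r meets the optimal DMT on 0 <= r <= 1.
    for r in (0, Fraction(1, 2), 1):
        print(r, ic_diversity(2, 2, 0, r), optimal_dmt(2, 2, r))
```

Outside the range $l \le r \le l + 1$ the line of segment $l$ lies below the optimal DMT, which is why $L$ different sequences are needed to cover the whole curve.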

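We present as an illustrative example the case of $M = N = 2$. Let us consider the case where $l = 0$. In this case $K_0 = \frac{4}{3}$ 
and $T_0 = 3$, i.e. we transmit a 4-complex dimensional IC. The transmission scheme diversity order in this case is $4 - 3r$, 
$0 \le r \le \frac{4}{3}$. In this case the effective channel matrix, $H_{\mathrm{eff}}^{(0)}$, consists of three blocks: $H_1 = (\underline{h}_1, \underline{h}_2)$, $H_2 = \underline{h}_1$ and $H_3 = \underline{h}_2$. 
According to our definitions 

$$|H_1^{\dagger} H_1| = \|\underline{h}_1\|^2 \cdot \|\underline{h}_{2 \perp 1}\|^2 = \|\underline{h}_1\|^2 \cdot \rho^{-\xi_{2,2}}$$

and also $\|\underline{h}_1\|^2 \doteq \rho^{-\min(\xi_{1,1}, \xi_{2,1})}$, $\|\underline{h}_2\|^2 \doteq \rho^{-\min(\xi_{1,2}, \xi_{2,2})}$. In accordance with (51) we divide the integral into two terms. In 
the first term we solve the optimization problem 

$$\min_{\xi_{i,j} \in A}\ (4 - 3r) - \left( \xi_{2,2} + 2 \cdot \min(\xi_{1,1}, \xi_{2,1}) + \min(\xi_{1,2}, \xi_{2,2}) \right) + \sum_{i=1}^{2} \sum_{j=1}^{2} \xi_{i,j}.$$

One solution to this problem is $\xi_{i,j} = 0$ for $1 \le i \le 2$, $1 \le j \le 2$. In this case we get an exponential term that equals $4 - 3r$. 
For the second integral we solve the optimization problem 

$$\min_{\xi_{i,j} \in \overline{A}}\ \sum_{i=1}^{2} \sum_{j=1}^{2} \xi_{i,j}.$$

In this case the optimization problem solution is $\sum_{i=1}^{2} \sum_{j=1}^{2} \xi_{i,j} = 4 - 3r$. Hence, all together, we get a diversity order that 
equals $4 - 3r$, which coincides with the optimal DMT of finite constellations in the range $0 \le r \le 1$. 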

In the next theorem we prove the existence of a sequence of lattices that has the same lower bound as in Theorem 4. 

Theorem 5. There exists a sequence of $2K_lT_l$-real dimensional lattices with transmitter density $\gamma_{tr} = \rho^{rT_l}$ and $T_l$ channel 
uses, that attains a diversity order 

$$d_{K_lT_l}(r) \ge (M - l)(N - l) - (r - l)(N + M - 2 \cdot l - 1)$$

where $0 \le r \le K_l$ and $l = 0, \ldots, L - 1$. 

Proof: See appendix E. ■ 
Note that we considered a $2K_lT_l$-real dimensional lattice, where the first $K_lT_l$ dimensions of the lattice are spread over the real 
part of the non-zero entries of $G_l$, and the other $K_lT_l$ dimensions of the lattice are spread over the imaginary part of the non-zero 
entries of $G_l$. This does not necessarily yield a $K_lT_l$-complex dimensional lattice in the transmission scheme. Considering the 
$2K_lT_l$-real dimensional lattice enables us to use the Minkowski-Hlawka-Siegel Theorem [13], [19], and prove Theorem 5. 



E. Power Peak to Average Ratio 

For practical reasons, such as power peak to average ratio, one may prefer to have a transmission scheme that spreads the 
transmitted power equally over time and space. The transmitting matrix Gi contains exactly KiTi non-zero entries, where 
the rest of the entries are zero. In order to spread the power more equally over time and space we use the following unitary 
operations 

$$U_L G_l U_R.$$

$U_L$ is an $M \times M$ unitary matrix that spreads each column of $G_l$, i.e. spreads over space. $U_R$ is a $T_l \times T_l$ unitary matrix that 
spreads each row of $G_l$, i.e. spreads over time. As the distributions of $H$ and $H \cdot U_L$ are identical, multiplying $U_L$ with $G_l$ 
gives exactly the same performance. Based on the notations from (2) we can state that 

$$G_l \cdot U_R = (\underline{x}_1, \ldots, \underline{x}_{T_l})$$

where $(\underline{x}_1, \ldots, \underline{x}_{T_l})$ are the channel inputs. In the receiver we can state that the received signals are $(\underline{y}_1, \ldots, \underline{y}_{T_l})$. By 
multiplying with $U_R^{\dagger}$, we get 

$$(\underline{y}_1, \ldots, \underline{y}_{T_l}) \cdot U_R^{\dagger} = H G_l + (\underline{n}_1, \ldots, \underline{n}_{T_l}) U_R^{\dagger}.$$

The distribution of $(\underline{n}_1, \ldots, \underline{n}_{T_l})$ is identical to the distribution of $(\underline{n}_1, \ldots, \underline{n}_{T_l}) U_R^{\dagger}$. Hence, multiplying $G_l$ with $U_R$ also gives 
exactly the same performance. For instance, in order to achieve full diversity and spread the power more uniformly, we 
take $G_0$ and duplicate its structure $s$ times to create the transmission scheme $G_0^{(s)}$. In this case the transmission matrix $G_0^{(s)}$ 
consists of $sK_0T_0$ complex non-zero entries, i.e. we transmit an $sK_0T_0$-complex dimensional IC within the $sMT_0$-complex 
space. $G_0^{(s)}$ is an $M \times sT_0$ dimensional matrix that has exactly the same diversity order as $G_0$ (it duplicates the structure of 
$G_0$ $s$ times). Each row of $G_0^{(s)}$ has exactly $sN$ non-zero entries. We define $U_R^{(s)}$ as an $sT_0 \times sT_0$ unitary matrix. For large enough 
$s$, the multiplication $G_0^{(s)} \cdot U_R^{(s)}$ spreads the power more uniformly over space and time, and still achieves full diversity. 
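The effect of the time-spreading matrix on the power profile can be illustrated for $M = N = 2$, $l = 0$, where the block structure $H_1 = (\underline{h}_1, \underline{h}_2)$, $H_2 = \underline{h}_1$, $H_3 = \underline{h}_2$ implies that the columns of $G_0$ occupy both antennas, the first antenna only, and the second antenna only. The sketch below rests on assumptions that are not in the text: the non-zero entries are modeled as uncorrelated, zero-mean symbols of unit power, and the unitary matrix is chosen to be the normalized DFT matrix. Under these assumptions the expected power of entry $(m, k)$ of $G_0 U_R$ is $\sum_j P_{m,j} |U_{j,k}|^2$.

```python
import cmath
import math


def dft_unitary(n):
    """Normalized n x n DFT matrix, one possible choice of unitary U_R."""
    w = cmath.exp(-2j * math.pi / n)
    scale = 1 / math.sqrt(n)
    return [[scale * w ** (j * k) for k in range(n)] for j in range(n)]


def expected_power_after_time_spreading(power, U):
    """Expected per-entry power of G * U for uncorrelated zero-mean entries.

    power[m][j] is the expected power of entry (m, j) of G (assumed model).
    """
    n = len(U)
    return [
        [sum(row[j] * abs(U[j][k]) ** 2 for j in range(n)) for k in range(n)]
        for row in power
    ]


# Power pattern of G_0 for M = N = 2, l = 0 (unit power on non-zero entries).
G0_POWER = [[1.0, 1.0, 0.0],
            [1.0, 0.0, 1.0]]

if __name__ == "__main__":
    for row in expected_power_after_time_spreading(G0_POWER, dft_unitary(3)):
        print([round(p, 4) for p in row])
```

With this choice every entry of $G_0 U_R$ carries the same expected power while the total power is unchanged, which is the sense in which the power is spread over time.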

F. Averaging Arguments 

In this subsection we show that there exist L sequences of lattices that attain the optimal DMT, where each sequence of 
the L sequences attains a different segment on the optimal DMT curve. In addition we show that there exists a single IC that 
attains the optimal DMT by diluting its points and adapting its dimensionality. 

As a consequence of Theorem 3 and Theorem 4 we can state the following. 

Corollary 3. Consider a sequence of $KT$-complex dimensional IC's $S_{KT}(\rho)$ with density $\gamma_{tr} = 1$ that attains diversity order 
$d$. This sequence of IC's also attains diversity order $d(1 - \frac{r}{K})$ when the sequence density is scaled to $\gamma_{tr} = \rho^{rT}$. 

Proof: Let $P_e(S(\rho), r)$ denote the average decoding error probability of the IC $S(\rho)$ with density $\gamma_{tr} = \rho^{rT}$. Since $S_{KT}(\rho)$ 
has density $\gamma_{tr} = 1$ for every $\rho$, this sequence of IC's has multiplexing gain $r = 0$. Hence, in accordance with our definitions, 
we denote the average decoding error probability of $S_{KT}(\rho)$ by $P_e(S_{KT}(\rho), 0)$. Assume 

$$P_e(S_{KT}(\rho), 0) = A(\rho)\, \rho^{-d}$$

where $-\lim_{\rho \to \infty} \log_{\rho} P_e(S_{KT}(\rho), 0) = d$, i.e. $S_{KT}(\rho)$ has diversity order $d$. By scaling the sequence of IC's such that 

$$S_{KT}(\rho) \rightarrow S_{KT}(\rho) \cdot \rho^{-\frac{r}{2K}}, \qquad 0 \le r \le K,$$

i.e., scaling $S_{KT}(\rho)$ by a factor of $\rho^{-\frac{r}{2K}}$, we get that $S_{KT}(\rho)$ has density $\gamma_{tr} = \rho^{rT}$, multiplexing gain $r$, and so its error 
probability 

$$P_e(S_{KT}(\rho), r) = P_e(S_{KT}(\rho^{1 - \frac{r}{K}}), 0) = A(\rho^{1 - \frac{r}{K}})\, \rho^{-d(1 - \frac{r}{K})}.$$

As a result we get $-\lim_{\rho \to \infty} \log_{\rho} P_e(S_{KT}(\rho), r) = d(1 - \frac{r}{K})$, i.e. $S_{KT}(\rho)$ has diversity order $d(1 - \frac{r}{K})$. ■ 
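The scaling argument of Corollary 3 can be illustrated numerically. The sketch below assumes a toy model in which the error probability at multiplexing gain zero is exactly $A \rho^{-d}$ with a constant $A$ (in the corollary $A(\rho)$ only needs to be sub-polynomial); the function names are hypothetical.

```python
import math


def scaled_error_probability(rho, d, r, K, A=1.0):
    """Error probability after scaling, P_e(S(rho^(1 - r/K)), 0).

    Toy model (assumption): P_e(S(rho), 0) = A * rho**(-d), A constant.
    """
    effective_rho = rho ** (1 - r / K)
    return A * effective_rho ** (-d)


def empirical_diversity(rho, d, r, K, A=1.0):
    """-log_rho of the scaled error probability at a finite rho."""
    return -math.log(scaled_error_probability(rho, d, r, K, A)) / math.log(rho)


if __name__ == "__main__":
    # M = N = 2, l = 0: d = 4 at r = 0 and K_0 = 4/3, so d(1 - r/K) = 4 - 3r.
    for rho in (1e2, 1e4, 1e8):
        print(rho, empirical_diversity(rho, d=4, r=0.5, K=4 / 3, A=3.0))
```

As $\rho$ grows the printed value approaches $d(1 - \frac{r}{K}) = 2.5$, and the influence of the constant $A$ vanishes like $\log A / \log \rho$.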

Corollary 4. The optimal DMT is attained by exactly $L$ sequences of $2K_lT_l$-real dimensional lattices, $l = 0, \ldots, L - 1$, where 
each sequence attains a different segment of the optimal DMT. 

Proof: From Theorem 5 we know that there exists a $2K_lT_l$-real dimensional sequence of lattices with density $\gamma_{tr} = 1$ that 
attains diversity $(M - l)(N - l) + l(N + M - 2 \cdot l - 1)$. Hence, based on Corollary 3 we can scale this $2K_lT_l$-real dimensional 
sequence of lattices into a sequence of lattices with density $\gamma_{tr} = \rho^{rT_l}$ and a diversity order $(M - l)(N - l) - (r - l)(N + 
M - 2 \cdot l - 1)$, i.e. the sequence of lattices attains the optimal DMT line in the range $l \le r \le l + 1$. The optimal DMT is the 
maximal value of the $L$ lines, for each $0 \le r \le L$. Hence, there exist $L$ sequences of lattices that attain the optimal DMT. ■ 
Next, we show that there exists a single sequence of IC's that attains the optimal DMT. The optimal DMT consists of $L$ 
segments of straight lines. Each segment is attained by reducing the IC's dimensionality to the correct dimension, and diluting 
their points to get the desired density. Note that in Theorem 4 we showed that for each multiplexing gain, $r$, there exists a 
sequence of IC's that attains the optimal DMT. On the other hand, in Corollary 5 we show that a single sequence of IC's 
attains the optimal DMT for any $r$, by adapting its dimensionality and diluting its points. Also note that $K_0T_0 > K_1T_1 > 
\cdots > K_{L-1}T_{L-1}$. 

Corollary 5. There exists a single sequence of $K_0T_0$-complex dimensional IC's that attains the $L$ segments of the optimal 
DMT: 

$$(M - l)(N - l) - (r - l)(N + M - 2 \cdot l - 1), \qquad 0 \le r \le K_l$$

where $l = 0, \ldots, L - 1$. The $l$'th segment is attained by reducing the IC's complex dimensionality to $K_lT_l$, and by diluting 
their points to get density $\gamma_{tr} = \rho^{rT_l}$. 

Proof: See Appendix F. ■ 

$^1$ It can be shown that replacing $U_L$ and $U_R$ with any other two invertible matrices still yields a transmission scheme that attains the optimal DMT. It extends 
the set of subspaces in $\mathbb{C}^{MT_l}$ that attain the optimal DMT. It also suggests that alongside the proposed transmission matrix of subsection IV-A there are many other options 
to attain the optimal DMT. 
V. Lattice Constellations Vs. Lattice Based Finite Constellations 

In this section we summarize our results and explain why lattice based coding schemes such as golden codes [9], perfect 
codes [10] and cyclic-division algebra based space-time codes [6] are sub-optimal when considering regular lattice decoding, 
i.e. decoding without taking into consideration the finiteness of the codebook. In addition we explain why using MMSE 
estimation in the receiver enables those schemes to attain the optimal DMT. Afterwards, based on our results, we give another 
geometrical interpretation of the optimal DMT. Finally, as in practice a finite codebook is transmitted, we show that a finite 
constellation with multiplexing gain $r$ can be carved from a lattice with multiplexing gain $r$ as defined for IC's in (4). 



A. Lattice Based Coding Schemes with Lattice Decoding are Suboptimal 

In this subsection we consider lattice based space time codes such as perfect codes and cyclic division algebra based space 
time codes. In case there are $M$ transmit antennas and $T$ channel uses, these transmission schemes essentially transmit a 
finite constellation carved from an $MT$-complex dimensional lattice, i.e. $K = M$. We explain why such transmission schemes 
are suboptimal when the decoder performs regular lattice decoding. On the other hand, we also explain why performing 
MMSE estimation followed by lattice decoding, as presented in Q, ([gl, enables these schemes to attain the optimal DMT. 
The explanation relies on the results presented in Theorems 1 and 2 of this paper. 

For simplicity let us consider the case $N \ge M$. First we show the best DMT those transmission schemes can achieve when 
employing regular lattice decoding. In this case for each channel realization the error probability is lower bounded by the 
probability that the observation is outside the effective ball. The effective ball is centered around the transmitted lattice point, 
and has a volume that equals the Voronoi region volume of the effective lattice in the receiver. The effective lattice in the 
receiver consists of the multiplication of each lattice point of the transmission scheme with the effective channel $H_{ex}$. The 
Voronoi region volume of the effective lattice is 

7rc r(A/r + 1) 

where Tcff is the effective radius of the effective ball. Based on Theorem [T] the probability that the observation is outside the 
effective ball is lower bounded by 

C{MT) ^^f^^^.A(MT) + (MT-l) In(Mrc) 
4 

and 



-e 

i=l 



Y,a,<{M-T) (55) 



C{MT) A(MT) 



M 



e 

i=l 



^ a, > (Af - r) (56) 



where for both (l55), (ED we have aM > ■ ■ ■ > ai > 0, A (MT) and C {MT) are defined in Theorem [T] and 

A*rc = P " ■ (57) 

The lower bound in (|55] l decays exponentially as a functions of p. However, the lower bound in ( |56] l is bounded away from 
zero for any p. Hence, for large p the question at hand is what is the most dominant realization for which the channel PDF 
p-^i=li^-^+'^^^-'^)°'i is maximized, when Yl!iLi'^i ^ (A/ — r) and um > ■ ■ ■ > ai > 0. The maximum is obtained 
for instance for an = • • ■ = Q!2 = and ai = M — r, which induce upper bound on the diversity order that equals to 
$(N - M + 1)(M - r)$. This upper bound coincides with the optimal DMT of finite constellations only in the range $M - 1 \le 
r \le M$. Hence, when transmitting lattice based space time codes which are $MT$-complex dimensional such as golden codes, 
perfect codes or cyclic division algebra based space time codes, the performance when considering regular lattice decoding in 
the receiver is suboptimal. 

For example, consider the case $M = N = 2$. In this case golden codes can be represented as a 4-complex dimensional lattice with $T = 2$. Hence, when considering regular lattice decoding, i.e. not taking into consideration the finiteness of the codebook in any way, the best diversity order that can be attained is $2 - r$, where $0 \le r \le 2$.
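
The comparison above can be illustrated numerically. The following minimal Python sketch evaluates the optimal DMT of finite constellations (the piecewise-linear curve of Zheng and Tse through the points $(k, (M-k)(N-k))$) next to the bound $(N-M+1)(M-r)$ for full-dimensional lattices with regular lattice decoding. The function names are illustrative assumptions and do not appear in the paper.

```python
import math

def optimal_dmt(r, M, N):
    """Optimal DMT of finite constellations: the piecewise-linear curve
    connecting the points (k, (M - k)(N - k)) for k = 0, ..., min(M, N)."""
    L = min(M, N)
    if not 0 <= r <= L:
        raise ValueError("multiplexing gain must satisfy 0 <= r <= min(M, N)")
    k = min(int(math.floor(r)), L - 1)      # index of the linear segment
    d_left = (M - k) * (N - k)              # value at r = k
    d_right = (M - k - 1) * (N - k - 1)     # value at r = k + 1
    return d_left + (r - k) * (d_right - d_left)

def full_dim_lattice_bound(r, M, N):
    """Upper bound (N - M + 1)(M - r) discussed in the text, for N >= M."""
    if N < M:
        raise ValueError("this sketch assumes N >= M")
    return (N - M + 1) * (M - r)

if __name__ == "__main__":
    M = N = 2  # the golden code setting of the example
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(r, optimal_dmt(r, M, N), full_dim_lattice_bound(r, M, N))
```

For $M = N = 2$ the two curves agree only for $1 \le r \le 2$; at $r = 0$ the bound gives 2 while the optimal DMT gives 4.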

On the other hand, when employing MMSE estimation followed by lattice decoding, the effective channel can be rewritten as $\left(I + \rho \cdot H_{ex}^{\dagger} \cdot H_{ex}\right)^{\frac{1}{2}}$. In this case the additive noise is no longer Gaussian, and consists of the sum of the additive Gaussian noise and another component that depends on the transmitted codeword (that lies within the shaping region). The performance in this case can be upper bounded by considering Gaussian noise independent of the transmitted codeword, with a vanishing increase in the variance value. In this case, based on [7], [8], the lattice decoder faces for each channel realization a volume to noise ratio that equals

$$\mu_{rc} = \rho^{\frac{1}{M}\sum_{i=1}^{M}\left(1 - \frac{r}{M} - \alpha_i\right)^{+}} \qquad (58)$$
where $(x)^{+} = x$ for $x \ge 0$ and zero otherwise. When averaging over the channel realizations we get [7], [8] that performing MMSE estimation followed by lattice decoding enables attaining the optimal DMT. Continuing the aforementioned example, golden codes achieve the optimal DMT when MMSE estimation followed by lattice decoding is performed.

Fig. 3. Illustrative example for the case $M = 2$, $N = 2$ of the significance of reducing dimensions when considering regular lattice decoding. For this example we assume that the realization of $H$ is diagonal, where the diagonal elements are $h_1$ and $h_2$. (a) Finite constellation: in this case, even when $h_2$ is small it is possible to decode. (b) Full dimensional infinite constellation: in this case, due to the infiniteness of the constellation, when $h_2$ is very small it is impossible to decode. (c) Infinite constellation with reduced dimension: in this case, even when $h_2$ is very small it is possible to decode.

A question that may arise is how the MMSE estimation assists in attaining the optimal DMT. The answer lies in the difference between $\mu_{rc}$ in (57) and $\mu_{rc}$ in (58). When considering regular lattice decoding, a certain small singular value can inflict outage on the code. For instance, in the aforementioned example, when $\alpha_1 > M - r$ the error probability is bounded away from zero. It results from the fact that the lattice consists of an infinite number of codewords and all these codewords are equally likely. When the singular value is very small, there must exist lattice points with small distance from the transmitted codeword in the direction of this singular value. Since the decoder makes no assumptions on the finiteness of the codebook, it cannot separate the transmitted codeword from these points. However, when performing the MMSE estimation the transmitted power is taken into account. In this case the effect of each singular value is reduced to the $T$ dimensions it occurs in. Reducing the effect of each small singular value to these $T$ dimensions comes at the expense of a self additive noise that depends on the transmitted codeword. However, under the assumption that the transmitted codewords are within a bounded shaping region, the variance of the effective noise is such that the optimal DMT is attained. In a sense (as also presented in [8]), performing MMSE estimation followed by lattice decoding yields good error performance for words within the shaping region. However, outside the shaping region the effective noise variance will increase with the lattice point norm, which eventually leads to poor error performance for lattice points far enough from the origin. On the other hand, regular lattice decoding yields the same performance for all lattice points inside or outside the shaping region.

In a certain sense, when considering regular lattice decoding, reducing the lattice dimensionality takes the role of the MMSE estimation in enabling to attain the optimal DMT. Figure 3 illustrates the improvement in the channel, attained by reducing the lattice dimensionality, for the case $M = 2$, $N = 2$.
B. Geometrical Interpretation to the Optimal DMT

In this subsection we give a geometrical interpretation to the optimal DMT, based on allocation of lattice dimensions. This is a qualitative discussion and the exact results appear in sections III, IV.

First, from our results we can see that for a sequence of lattices with a certain number of dimensions per channel use, $K$, the DMT is a straight line as a function of the multiplexing gain (see Corollary 3). It results from the fact that for lattices, changing the multiplexing gain is equivalent to scaling each dimension by $\rho^{-\frac{r}{2K}}$. Assume that the sequence of lattices attains for multiplexing gain $r = 0$ diversity order $d$, i.e. the error probability decays as $\rho^{-d}$. In this case scaling each dimension by $\rho^{-\frac{r}{2K}}$ leads to error probability that decays as $\rho^{-d(1-\frac{r}{K})}$. This behavior results from the fact that the lattice decoder takes into account all the lattice points. Hence, the scaling merely replaces $\rho$ with $\rho^{1-\frac{r}{K}}$ in the error probability expression. The optimal DMT is a piecewise linear function. We get that each line corresponds to a sequence of lattices with a certain number of dimensions per channel use.

Next we wish to give the reasoning for the average number of dimensions per channel use required to achieve each line of the optimal DMT. For simplicity let us assume the case $M = N$. First let us consider the straight line that coincides with the optimal DMT in the range $0 \le r \le 1$. In this case the average number of dimensions per channel use required to attain this straight line equals $\frac{MN}{N+M-1} = \frac{M^2}{2M-1}$. Let us assume $T = N + M - 1 = 2M - 1$, i.e. the sequence of lattices is $M^2$-complex dimensional. The sequence of lattices lies within an $MT$-complex dimensional space. In this space each singular value of the channel matrix $H$ occurs $T$ times. Assume each complex dimension of the sequence of lattices is transmitted on a certain singular value. Let us denote by $T_i$ the number of dimensions transmitted on the singular value $\rho^{-\frac{\alpha_i}{2}}$, where in our example $\sum_{i=1}^{M} T_i = M^2$. For large $\rho$ the channel singular values PDF is of the form $\rho^{-\sum_{i=1}^{M}(2i-1)\alpha_i}$, where $\alpha_1 \ge \dots \ge \alpha_M \ge 0$. In order to attain the optimal DMT in the range $0 \le r \le 1$ when transmitting an $M^2$-complex dimensional sequence of lattices, the allocation of dimensions needs to fulfil the following condition

$$\sum_{i=1}^{j} T_i \le j^2, \qquad 1 \le j \le M, \qquad (59)$$

i.e. each singular value cannot occur in more dimensions than the effect it has on the PDF of the singular values. For instance, when transmitting an $M^2$-complex dimensional lattice, $\rho^{-\frac{\alpha_1}{2}}$ cannot be assigned more than a single dimension, $\rho^{-\frac{\alpha_1}{2}}$ and $\rho^{-\frac{\alpha_2}{2}}$ cannot be assigned together more than four dimensions, et cetera. Hence, in a sense, when transmitting an $M^2$-complex dimensional sequence the allocation of dimensions needs to match the channel. Since the strongest singular value $\rho^{-\frac{\alpha_M}{2}}$ has an effect of $\rho^{-(2M-1)\alpha_M}$ on the PDF, choosing $T = 2M - 1$ is the minimal number of channel uses that enables fulfilling (59) when transmitting an $M^2$-complex dimensional sequence of lattices. Note that by adding a single dimension and transmitting an $(M^2+1)$-complex dimensional sequence of lattices, while maintaining $T = 2M - 1$ (if possible), (59) cannot be satisfied. In general, as shown in Theorem 2, any choice of $K > \frac{M^2}{2M-1}$ will lead to suboptimal DMT in the range $0 \le r \le 1$.
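
The allocation argument can be checked numerically. The sketch below is only illustrative: it assumes that condition (59) reads $\sum_{i=1}^{j} T_i \le j^2$ for $1 \le j \le M$, that each singular value offers at most $T$ dimensions ($0 \le T_i \le T$), and that $M = N$. The helper names are hypothetical. It tests whether a given total number of complex dimensions can be allocated at all.

```python
from itertools import accumulate

def satisfies_allocation(T_alloc):
    """Check the assumed form of (59): sum_{i<=j} T_i <= j^2 for every j."""
    partial_sums = accumulate(T_alloc)
    return all(s <= (j + 1) ** 2 for j, s in enumerate(partial_sums))

def matched_allocation(M):
    """Allocation T_i = 2i - 1, which meets (59) with equality for every j."""
    return [2 * i - 1 for i in range(1, M + 1)]

def exists_allocation(M, total_dims, T):
    """Is there an allocation with 0 <= T_i <= T, sum T_i = total_dims, obeying (59)?

    Greedy argument: taking each T_j as large as allowed, min(T, j^2 - partial sum),
    maximises every partial sum, so the greedy total is the largest reachable
    number of dimensions; any smaller total is reachable by removing dimensions."""
    partial = 0
    for j in range(1, M + 1):
        partial += min(T, j * j - partial)
    return total_dims <= partial

if __name__ == "__main__":
    M = 3
    print(matched_allocation(M))                   # allocation matched to the channel
    print(exists_allocation(M, M * M, 2 * M - 1))  # M^2 dimensions with T = 2M - 1
    print(exists_allocation(M, M * M + 1, 2 * M - 1))
    print(exists_allocation(M, M * M, 2 * M - 2))
```

Under these assumptions, $M^2$ dimensions fit with $T = 2M - 1$, while either one extra dimension or one fewer channel use makes the condition infeasible, in line with the discussion above.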

As for the straight lines in the range $l \le r \le l+1$, where $l = 1, \dots, M-1$: in this case note that

$$d^{*}_{M,N}(r) = d^{*}_{M-l,N-l}(r-l), \qquad l \le r \le l+1, \qquad (60)$$

i.e. the optimal DMT in the range $l \le r \le l+1$ equals the optimal DMT of a channel with $M - l$ transmit and $N - l$ receive antennas, shifted by $l$. Again let us consider the case $M = N$. In this case the optimal number of dimensions per channel use equals $\frac{(M-l)(N-l)}{N+M-1-2l} + l = \frac{(M-l)^2}{2(M-l)-1} + l$. Let us consider an $\left((M-l)^2 + l \cdot (2(M-l)-1)\right)$-complex dimensional sequence of lattices with $T = 2(M-l) - 1$. In this case, in order to attain the optimal DMT, the allocation of dimensions needs to fulfil that $l \cdot (2(M-l)-1)$ complex dimensions are transmitted on the first $l$ strongest singular values $\rho^{-\frac{\alpha_M}{2}}, \dots, \rho^{-\frac{\alpha_{M-l+1}}{2}}$, where the rest of the $(M-l)^2$ dimensions are assigned in accordance with the PDF of the singular values $\rho^{-\frac{\alpha_1}{2}}, \dots, \rho^{-\frac{\alpha_{M-l}}{2}}$. When $r \ge l$ it can be viewed as if the multiplexing gain "deducts" the effect of $l \cdot (2(M-l)-1)$ dimensions on the error probability. Hence, effectively we get an $(M-l)^2$-complex dimensional sequence of lattices with multiplexing gain $r - l$, where its dimensions match, as in (59), a channel with $M - l$ transmit and receive antennas. This leads to the same behavior as the first line of $d^{*}_{M-l,M-l}(\cdot)$.
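
As a sanity check of the dimension counts in this subsection, the following Python sketch computes the number of dimensions per channel use $K_l = \frac{(M-l)(N-l)}{N+M-1-2l} + l$ and verifies that the straight line $(M-l)(N-l)\frac{K_l - r}{K_l - l}$ passes through the corner points $(l, (M-l)(N-l))$ and $(l+1, (M-l-1)(N-l-1))$ of the optimal DMT. The line expression is taken from the solution of the optimization problem in Appendix B as reconstructed here, and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def dims_per_channel_use(l, M, N):
    """K_l = (M - l)(N - l) / (N + M - 1 - 2l) + l, for l = 0, ..., min(M, N) - 1."""
    return Fraction((M - l) * (N - l), N + M - 1 - 2 * l) + l

def line_diversity(r, l, M, N):
    """Straight line (M - l)(N - l)(K_l - r) / (K_l - l) obtained with K_l dimensions."""
    K = dims_per_channel_use(l, M, N)
    return (M - l) * (N - l) * (K - r) / (K - l)

def corner(k, M, N):
    """Corner point (M - k)(N - k) of the optimal DMT at integer multiplexing gain k."""
    return (M - k) * (N - k)

if __name__ == "__main__":
    M = N = 3
    for l in range(min(M, N)):
        K = dims_per_channel_use(l, M, N)
        print(l, K, line_diversity(l, l, M, N), line_diversity(l + 1, l, M, N))
```

Exact rational arithmetic is used so that the corner points are matched without rounding error.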

C. The Relation Between Multiplexing Gain of IC's and Finite Constellations 

In this paper we defined the multiplexing gain of a sequence of IC's as the rate at which the IC's density increases (4), i.e. when $\gamma_{tr} = \rho^{rT}$ the multiplexing gain is $r$. We characterized the optimal DMT of IC's based on this definition of the multiplexing gain. In practice a finite constellation is transmitted, even when performing regular lattice decoding in the receiver. Hence, in this subsection we show that a finite constellation with multiplexing gain $r$ can be carved from a lattice with multiplexing gain $r$ (according to the definition given in (4)), while maintaining the same performance when performing regular lattice decoding in the receiver.

Consider a lattice $\Lambda$ with density $\gamma_{tr} = \rho^{rT}$. In this case for each lattice point the Voronoi region volume equals

$$|V(x)| = |V| = \frac{1}{\gamma_{tr}} = \rho^{-rT}, \qquad \forall x \in \Lambda.$$

In [20] it has been shown that for any Jordan measurable bounded set $S$ with volume $|V(S)|$ there exists a translate $u$ such that

$$|(\Lambda + u) \cap S| \ge \frac{|V(S)|}{|V|} \qquad (61)$$

where $\Lambda + u$ is the translate of each lattice point by the constant $u$, and $|(\Lambda + u) \cap S|$ is the number of words of the translated lattice within the region $S$. Hence, for each lattice in a sequence with multiplexing gain $r$, there exists a translate such that the number of codewords within a sphere with volume 1 is larger than or equal to $\rho^{rT}$, i.e. the rate is $r \log(\rho)$, where in this setting $\rho$ takes the role of SNR. Hence, it is possible to carve from the sequence of translated lattices a sequence of finite constellations with multiplexing gain $r$ according to the definitions of finite constellations. When performing regular lattice decoding the translate does not affect the performance. Hence, the results we presented in this work also apply when carving finite constellations with the corresponding multiplexing gain from the sequence of lattices, and performing regular lattice decoding in the receiver.
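
The existence of a good translate in (61) can be illustrated on a toy case. The sketch below assumes the integer lattice $\mathbb{Z}^2$ (Voronoi volume 1) and takes $S$ to be a disc centered at the origin; it searches a finite grid of translates $u$ and reports one for which the number of translated lattice points inside $S$ is at least the area of $S$. The lattice, the region and the grid search are illustrative choices, not the construction used in [20].

```python
import math
from itertools import product

def count_in_disc(u, radius):
    """Count points of the translated lattice Z^2 + u inside the disc |x| <= radius."""
    reach = int(math.ceil(radius)) + 1  # integer range that surely covers the disc
    count = 0
    for i, j in product(range(-reach, reach + 1), repeat=2):
        x, y = i + u[0], j + u[1]
        if x * x + y * y <= radius * radius:
            count += 1
    return count

def best_translate(radius, grid=20):
    """Search translates u on a grid of the unit cell and keep the best one."""
    best_u, best_count = None, -1
    for a, b in product(range(grid), repeat=2):
        u = (a / grid, b / grid)
        c = count_in_disc(u, radius)
        if c > best_count:
            best_u, best_count = u, c
    return best_u, best_count

radius = 1.5
area = math.pi * radius ** 2        # |V(S)|; the Voronoi cell of Z^2 has volume 1
u, n = best_translate(radius)       # n is the codebook size carved by translate u
```

Since the average of the point count over all translates in the unit cell equals the area of $S$, at least one translate must reach the bound; the grid search simply exhibits one.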

VI. Conclusion 

In this work we introduced the fundamental limits of IC's/lattices in MIMO fading channels. We believe that this work can 
set a framework for designing lattices for MIMO channels using lattice decoding. 

Appendix A 
Proof of Theorem 1

We prove the result for any IC with density $\gamma_{rc}$. The proof outline is as follows. We prove the theorem by contradiction. First, for a given IC with receiver density $\gamma_{rc}$, we assume an average decoding error probability that equals the lower bound we wish to prove. Then, we derive a "regular" IC from the given IC with the same density and the same average decoding error probability. Regularizing the IC allows us to find a lower bound on the IC maximal error probability that depends on its density. We expurgate the half of the codewords with the largest error probability and get another regular IC with density $\frac{\gamma_{rc}}{2}$. Based on the average decoding error probability, we upper bound the expurgated IC maximal error probability, and based on its density we lower bound the same maximal error probability, and get a contradiction.

Let us consider a $KT$-complex dimensional IC in the receiver, $S'_{KT}(\rho)$, with receiver density $\gamma_{rc}$ and average decoding error probability

$$\overline{P_e}(H, \rho) = (1 - \epsilon^{*})\frac{C(KT)}{4} e^{-\mu_{rc} A(KT) + (KT-1)\ln(\mu_{rc})} \qquad (62)$$

where A(XT) = . r(i^T + 1) w, C(Xr) = ( )^ and < e^,e, < 1. 

Next we construct a regularized IC, Sj^rp[p), from Sj^rp[p), whose Voronoi regions are bounded and have finite volumes , i.e. 
there exists a finite radius r such that V{x) C Ball{x, r), Vx € Sj^rp{p), where Ball{x, r) is a i^T-complex dimensional ball 
centered around x. We construct Sj^j,{p) in the following manner. Let us define Co(p, H) — {Sj^j.[p) {^{Hex ■ cubeKrib))}, 
i.e. a finite constellation derived from Sj^rp{p). We turn this finite constellation into an IC by tiling Co{p, H) in the following 
manner 

S'kAp) = Co{p, H) + {b + b)H,xI?'^^ (63) 

where for simplicity we assumed that cubeKT{b) C C^^, i.e. contained within the first KT complex dimensions. Correspond- 
ingly, under this assumption, H^x equals the first KT complex columns of Hex- In this case, the tiling of Co{p,H) is done 
according to the complex integer combinations of Hex columns. In general, cubeKrib) may be a rotated cube within C*^-'^. 
In this case the tiling is done according to some KT complex linearly independent vectors, consisting of linear combinations 
of Hex columns. An alternative way to construct S^xip) is by considering the transmitter IC Skt{p)- In this case we can 
construct another IC in the transmitter 

Skt{p) = {SiMp) fl cubeKTib)} + (b + b)!?""^ (64) 

where without loss of generality we assumed again that cubeKrib) G C^'^ . In this case SKrip) — {Hex ■ <5'kt(p)}- 

Next we would like to set b and b to be large enough such that S^rpip) has average decoding error probability smaller or 
equal to ^^l^£D.g- tJ-<:<:- Mkt)+{kt -l)\r^{^lra) density larger or equal to ^rc- Due to the symmetry that results from the tiling 
(|63] l, it is sufficient to upper bound the average decoding error probability of the points x e Co(p, H) C Sj^rp{p) denoted by 

s" " s 

Pe (Co) in order to upper bound the average decoding error probability of the entire IC SKxip) ■ Hence Pe (Co) is also 

the average decoding error probability for the IC S'^y(p). We can upper bound the error probability in the following manner 

Pe'''^iCo) < Pe(Co) + PeiS'KT \ Co) (65) 

where Pe(Co) is the average decoding error probability of the finite constellation Coip,H) and PeiSKx \ Co) is the average 
decoding error probability to points in the set {Sj^j. \ Co{p, h)}, i.e. the error probability inflicted by the replicated codewords 
outside the set Coip,H). 
We begin by upper bounding the second term in (65), i.e. the error probability inflicted by the replicated codewords outside $C_0(\rho, H)$, by choosing $b$ to be large enough. By the tiling at the transmitter (64) and
the fact that we have finite complex dimension KT, for a certain channel realization H^x we get that there exists S{Hex) 
such that any pair of points xi G Co{p,H), X2 G {Sj^j. \ Co{p,h)} fulfils \\x-^ — Xjll > 2& • 5{Hex)- The term 6{Hex) is 
a factor that defines the minimal distance between these 2 sets for a given channel realization. Note that also for the case 
M > N, there must exist such 8{Hex), as we assumed that Sj^j,{p) is i^T-complex dimensional IC, i.e. the projected IC 
Sj^rp[p) — HexS kt{p) is also A'T-complex dimensional. Hence, we get that 



PeiS^^XCo) < Pr{\\n^J,>b 5{H,x)) 
where n^^ is the effective noise in the i^TT-complex dimensional hyperplane where Sj^rp[p) resides. By using the upper bounds 

2KT 

(b'5(H (b'S{Hex)fe i^j. 



from Ids], we get that for "^^^g"^^ > 



Hence, for b large enough we get that 

Pe{S';,r \ Co) < (1 - 

Now we would like to upper bound the error probability, Pe(Co), of the finite constellation Co{p,H). According to the 
definition of the average decoding error probability in (|8}, the definition of Co{p, H) and the assumption in ( l62b . we get that 

where limb-^oo£{b) = 0. It results from the fact that in (|8) we take the limit supremum, and so for b large enough the average 
decoding error probability of the IC must be upper bounded by the aforementioned term. Also, for any b the average decoding 
error probability of the finite constellation Co{p,H) is smaller or equal to the error probability, defined in dS], of decoding 
over the entire IC. Based on the upper bound from (l65t we get the following upper bound on the error probability of Sj^rp(p) 

pf^^Co) < (i-^')a+^W) c(j^r)e-^"-^(^^) • pi^^-'K (66) 

According to the definition of jrc and due to the fact that we are taking limit supremum: for any < ei < 1 there exists b 
large enough such that 

, (67) 
vol[Hex ■ cubeKT{b)) 

where |Co(/9, is the number of points in Co{p,H). In fact there exists large enough b that fulfils both (1661 ) and ( l67b . 

In ( l63T l we tiled by 6 + 6 . If we had tiled Cq{p, H) only by &, then for large enough b we would have got IC with density 
larger or equal to (1 — ei)7rc- However , as we tile by 6 + 6 , we get for b large enough that Sj^rp[p) has density greater or 
equal to ^—^-^rc- Hence, for any < £2 < 1 there exists b large enough such that 

lrc>{l-ei){l-e2)lrc. (68) 



where 7^^ is the density of S^rp[p). Again, there also must exist large enough b that fulfils ( |66] | and ( I68I I simultaneously. 
Hence, for large enough b we can derive from Sj^j,{p) an IC Sj^rp{p) with density 7^^, > (1 — ~ e2)7rc and average 
decoding error probabiHty smaller or equal to (i:;lll^±lMc(A'r)e-''--^(^'^)+('f^^-i) 

By averaging arguments we know that expurgating the worst half of the codewords in Sj^rp{p), yields an IC Sj^rp[p) with 
density 

7^c> (l-ei)(l-e2)^ (69) 

and maximal decoding error probability 



s^PxGS';, 



Pe ^^(x) < (1 - £*)(! + e(6))C(ifr)e-^"-^(^^Vr^^~' (70) 



where Pe ^^(x) is the error probability of a; G Sj^j,{p). 

From the construction method of Sj^rp^p), defined in ( |63] |. it can be easily shown that tiling Co{p,H) yields bounded 
and finite volume Voronoi regions, i.e. there exists a finite radius r such that V{x) C Ball{x,r), Vx G Sj^rp^p). Due to the 
symmetry that results from Sj^rp{p) construction ( |63] l, it also applies for Sj^rp{p). Hence, there must exist a point xq € Sj^rp{p) 
that satisfies |V^(a;o)| < -777- < =■ According to the definition of the effective radius in ([T), we get that roff(a;o) < 7'cff(7rc)- 
Hence, we get



where the lower bound Pe {xq) > ^'T'dl^^exll ^ ^cff(2^o)) was proven in lfT3l . We calculate the following lower bound 



By assigning r^g = ( we g 



et 



ITT . 



swp^gs-^Pe -"(x) > C{KT) ■ e-t^^^«^^+^^^-^^'"^t#;^^ (73) 
Hence, for certain ei and £2 we get 



where /i^c = "^j^- For 6 large enough we get (1 — e*)(l + e(&)) < 1, and so (|741 ) contradicts (fTol i. As a result we get contradic- 
tion of the initial assumption in This contradiction also holds for any p) < il:^£}^iED. g- t^.c- AiKT)+iKT -1) inif,,,) ^ 
Hence, we get that 

P:{H,p) > (75) 

Note that the lower bound holds for any < ei,e2,e* < 1 and also that the expressions in ( |62] |. dTSl l are continuous. As a 
result we can also set ei = £2 = e* = and get the desired lower bound. Finally, note that we are interested in a lower bound 
on the error probability of any IC for a given channel realization. Hence, we are free to choose different values for b and b 
for each channel realization, and b . 



Appendix B 

Proof of the optimization problem in Theorem 2

We would like to solve the optimization problem of Theorem 2 for any value of $K = B + \beta \le L$, where $B \in \mathbb{N}$ and $0 < \beta \le 1$. First we consider the case of $0 < K \le 1$, i.e. the case where $B = 0$. In this case the constraint boils down to $\alpha_L = 1 - \frac{r}{K}$. By assigning $\alpha_1 = \dots = \alpha_L = 1 - \frac{r}{K}$ we get that $d_{KT}(r) \le MN(1 - \frac{r}{K})$. Next we analyze the case where $K > 1$. Due to the constraint, the minimal value must satisfy $\alpha_1 = \dots = \alpha_{L-B}$. From the constraint we also know that $\alpha_L = K - r - \sum_{i=1}^{B-1}\alpha_{L-i} - \beta\alpha_{L-B}$. By substituting into the objective we get

$$\min_{\underline{\alpha} \ge 0}\ (K-r)(N+M-1) + \left((M-B)(N-B) - \beta(N+M-1)\right)\alpha_{L-B} - \sum_{i=1}^{B-1} 2i \cdot \alpha_{L-i} \qquad (76)$$

where $\underline{\alpha} \ge 0$ signifies $\alpha_1 \ge \dots \ge \alpha_L \ge 0$. We would like to consider two cases: the case where $(M-B)(N-B) - \beta(N+M-1) \ge \sum_{i=1}^{B-1} 2i$ and the case where $(M-B)(N-B) - \beta(N+M-1) < \sum_{i=1}^{B-1} 2i$. The first case, where $(M-B)(N-B) - \beta(N+M-1) \ge B(B-1)$, is achieved for $K \le \frac{MN}{N+M-1}$. In this case we use the following Lemma in order to find the optimal solution.

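Lemma 3. Consider the optimization problem

$$\min_{\underline{c}}\ B_1 c_1 - \sum_{i=2}^{D} B_i c_i$$

where: (1) $c_1 \ge \dots \ge c_D \ge 0$; (2) $B_1 \ge \sum_{i=2}^{D} B_i$ and $B_2 \ge \dots \ge B_D \ge 0$; (3) $\beta c_1 + \sum_{i=2}^{D} c_i = \delta > 0$, where $0 < \beta \le 1$.

The minimal value is achieved for $c_1 = \dots = c_D = \frac{\delta}{D-1+\beta}$.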

Proof: We prove by induction. First let us consider the case where $D = 2$. In this case we would like to find

$$\min_{\underline{c}}\ B_1 c_1 - B_2 c_2 \qquad (77)$$

where $c_1 \ge c_2 \ge 0$, $\beta c_1 + c_2 = \delta > 0$, $B_1 \ge B_2 \ge 0$ and $0 < \beta \le 1$. It is easy to see that for this case the minimum is achieved for $c_1 = c_2$, as increasing $c_1$ while decreasing $c_2$ to satisfy $\beta c_1 + c_2 = \delta$ will only increase (77).

Now let us assume that for $D$ elements the minimum is achieved for $c_1 = \dots = c_D = \frac{\delta}{D-1+\beta}$. Let us consider $D+1$ elements with constraint $\beta c_1 + \sum_{i=2}^{D+1} c_i = \delta$. For $c_1 = \dots = c_{D+1} = \frac{\delta}{D+\beta}$ we get

$$\left(B_1 - \sum_{i=2}^{D+1} B_i\right)\frac{\delta}{D+\beta}. \qquad (78)$$

We would like to show that this is the minimal possible value for this problem. Take $c_{D+1} = \frac{\delta}{D+\beta} - \epsilon \ge 0$. In this case $\beta c_1 + \sum_{i=2}^{D} c_i = \frac{(D-1+\beta)\delta + (D+\beta)\epsilon}{D+\beta}$ in order to satisfy $\beta c_1 + \sum_{i=2}^{D+1} c_i = \delta$. According to our assumption, $B_1 c_1 - \sum_{i=2}^{D} B_i c_i$ is minimal for $c_1 = \dots = c_D = \frac{\delta}{D+\beta} + \frac{\epsilon}{D-1+\beta}$. By assigning these values we get

$$\left(B_1 - \sum_{i=2}^{D+1} B_i\right)\frac{\delta}{D+\beta} + \left(B_1 - \sum_{i=2}^{D} B_i\right)\frac{\epsilon}{D-1+\beta} + B_{D+1}\epsilon$$

which is greater than (78). This concludes the proof. ■
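
Lemma 3 can be sanity-checked numerically. The sketch below relies on an observation that is not part of the paper's proof: the feasible set (ordered, non-negative $c$ on the hyperplane $\beta c_1 + \sum_{i=2}^{D} c_i = \delta$) is a polytope whose vertices have $c_1 = \dots = c_k = \frac{\delta}{k-1+\beta}$ and the remaining entries zero, so a linear objective attains its minimum at one of these $D$ vertices. The function names and the test coefficients are assumptions chosen for illustration.

```python
def objective(B, c):
    """B_1 c_1 - sum_{i>=2} B_i c_i (lists are 0-indexed)."""
    return B[0] * c[0] - sum(b * x for b, x in zip(B[1:], c[1:]))

def vertex(k, D, beta, delta):
    """Vertex with c_1 = ... = c_k = delta / (k - 1 + beta) and c_{k+1} = ... = c_D = 0."""
    value = delta / (k - 1 + beta)
    return [value] * k + [0.0] * (D - k)

def best_vertex(B, beta, delta):
    """Enumerate the D vertices of the feasible set and return the minimising one."""
    D = len(B)
    candidates = [vertex(k, D, beta, delta) for k in range(1, D + 1)]
    return min(candidates, key=lambda c: objective(B, c))

if __name__ == "__main__":
    B = [10.0, 3.0, 2.0, 1.0]   # satisfies B_1 >= B_2 + B_3 + B_4 and B_2 >= B_3 >= B_4
    beta, delta = 0.5, 1.0
    print(best_vertex(B, beta, delta))   # all entries equal delta / (D - 1 + beta)
```

When condition (2) of the lemma holds, the numerator $B_1 - \sum_{i=2}^{k} B_i$ stays non-negative and shrinks with $k$ while the denominator $k - 1 + \beta$ grows, so the vertex with all entries equal is the minimiser, as the lemma states.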
For the case $(M-B)(N-B) - \beta(N+M-1) \ge B(B-1)$, the optimization problem coincides with Lemma 3, as it fulfils the condition $B_1 \ge \sum_{i=2}^{D} B_i$ in the lemma. Hence, the optimization problem solution for $K \le \frac{MN}{N+M-1}$ is $\alpha_1 = \dots = \alpha_{L-1} = \frac{K-r-\alpha_L}{K-1} = \alpha$. The minimum is achieved when $\alpha_L = \alpha$, i.e. the maximal value $\alpha_L$ can receive under the constraint $\alpha_1 \ge \dots \ge \alpha_L \ge 0$. We get that $\alpha = 1 - \frac{r}{K}$, and the optimization problem solution for the case $K \le \frac{MN}{N+M-1}$ is $d_{KT}(r) \le MN(1 - \frac{r}{K})$.

For the case $(M-B)(N-B) - \beta(N+M-1) < B(B-1)$, or equivalently $K > \frac{MN}{N+M-1}$, we would like to show that the optimal solution must fulfil $\alpha_L = 0$. It results from the fact that for the optimal solution, the term $\left((M-B)(N-B) - \beta(N+M-1)\right)\alpha_{L-B} - \sum_{i=1}^{B-1} 2i \cdot \alpha_{L-i}$ in (76) must be negative. This is due to the fact that taking $\alpha_1 = \dots = \alpha_{L-1}$ gives a negative value. Hence, for the optimal solution we would like to maximize $\sum_{i=1}^{B-1}\alpha_{L-i} + \beta\alpha_{L-B} = K - r - \alpha_L$. By taking $\alpha_L = 0$ the sum is maximized. Hence, the optimal solution for $K > \frac{MN}{N+M-1}$ must have $\alpha_L = 0$.



Now consider the general case. Assume that for $K > \frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1$ the optimal solution must have $\alpha_L = \dots = \alpha_{L-l+1} = 0$. First consider the case where $1 \le l \le B-1$. For this case the constraint is $\sum_{i=l}^{B-1}\alpha_{L-i} + \beta\alpha_{L-B} = K - r$, i.e. the constraint contains at least two singular values. We can rewrite the optimization problem as follows

$$\min_{\underline{\alpha} \ge 0}\ (K-r)(N+M-1-2l) + \left((M-B)(N-B) - \beta(N+M-1-2l)\right)\alpha_{L-B} - \sum_{i=l+1}^{B-1} 2(i-l)\cdot\alpha_{L-i}. \qquad (79)$$

For the case $(M-B)(N-B) - \beta(N+M-1-2l) \ge (B-l-1)(B-l)$ we get that $K \le \frac{(M-l)(N-l)}{N+M-1-2l} + l$. We also assumed that $K > \frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1$. For this case we can use Lemma 3 and get that the optimization problem solution is $\alpha_{L-l-1} = \dots = \alpha_{L-B} = \frac{K-r-\alpha_{L-l}}{K-l-1} = \alpha$. The minimum is achieved for $\alpha_{L-l} = \alpha$. We get that $\alpha_L = \dots = \alpha_{L-l+1} = 0$ and $\alpha_1 = \dots = \alpha_{L-l} = \frac{K-r}{K-l}$. Hence, for the case $\frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1 < K \le \frac{(M-l)(N-l)}{N+M-1-2l} + l$ the solution is
$d_{KT}(r) \le (N-l)(M-l)\frac{K-r}{K-l}$.

For the case $(M-B)(N-B) - \beta(N+M-1-2l) < (B-l-1)(B-l)$, or equivalently $K > \frac{(M-l)(N-l)}{N+M-1-2l} + l$, the term $\left((M-B)(N-B) - \beta(N+M-1-2l)\right)\alpha_{L-B} - \sum_{i=l+1}^{B-1} 2(i-l)\cdot\alpha_{L-i}$ in (79) must be negative for the optimal solution. This is due to the fact that by taking $\alpha_1 = \dots = \alpha_{L-l-1}$ we get a negative value. Hence we would like to maximize the sum $\sum_{i=l+1}^{B-1}\alpha_{L-i} + \beta\alpha_{L-B} = K - r - \alpha_{L-l}$. The sum is maximized by taking $\alpha_{L-l} = 0$. Hence the optimal solution for the case $K > \frac{(M-l)(N-l)}{N+M-1-2l} + l$ must have $\alpha_{L-l} = \dots = \alpha_L = 0$. Note that for the case $l = B - 1$ we have only two terms in the constraint $\alpha_{L-B+1} + \beta\alpha_{L-B} = K - r$. However, the solution remains the same.

For the case $K > \frac{(M-l+1)(N-l+1)}{N+M-1-2(l-1)} + l - 1$ and $l = B$ the constraint is of the form $\alpha_{L-B} = \frac{K-r}{\beta}$. Again we assume that $\alpha_{L-B+1} = \dots = \alpha_L = 0$. In this case the solution is $\alpha_1 = \dots = \alpha_{L-l} = \frac{K-r}{K-l}$ and so $d_{KT}(r) \le (M-l)(N-l)\frac{K-r}{K-l}$. This concludes the proof.
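
The case analysis of this appendix can be summarized as a small function. The Python sketch below is a hedged reading of the result as reconstructed here: for $K \le \frac{MN}{N+M-1}$ the bound is $MN(1 - \frac{r}{K})$, and for $K$ between consecutive thresholds $\frac{(M-l)(N-l)}{N+M-1-2l} + l$ the bound is $(M-l)(N-l)\frac{K-r}{K-l}$. The function names and the input validation are assumptions added for illustration.

```python
from fractions import Fraction

def threshold(l, M, N):
    """Largest K handled by case l: (M - l)(N - l) / (N + M - 1 - 2l) + l."""
    return Fraction((M - l) * (N - l), N + M - 1 - 2 * l) + l

def diversity_upper_bound(r, K, M, N):
    """Upper bound on the diversity order for K dimensions per channel use."""
    L = min(M, N)
    K, r = Fraction(K), Fraction(r)
    if not (0 < K <= L and 0 <= r <= K):
        raise ValueError("expected 0 < K <= min(M, N) and 0 <= r <= K")
    for l in range(L):
        if K <= threshold(l, M, N):
            # l = 0 reduces to M N (1 - r / K)
            return (M - l) * (N - l) * (K - r) / (K - l)
    raise AssertionError("unreachable: threshold(L - 1) equals L")

if __name__ == "__main__":
    # M = N = 2: K = 4/3 gives the first line of the optimal DMT, K = 2 gives 2 - r
    for K in (Fraction(4, 3), Fraction(2)):
        print(K, [diversity_upper_bound(Fraction(k, 2), K, 2, 2) for k in range(3)])
```

For $M = N = 2$ and $K = 2$ the function returns $2 - r$, which agrees with the bound quoted for golden codes under regular lattice decoding.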



Appendix C 
Proof of Lemma 1

We begin by proving the case $N \ge M$. From the construction of $G_l$ it can be seen that a set of columns $\{h_j, \dots, h_i\}$ may occur in $N - i + j$ blocks at most. It results from the fact that we can only subtract $M - i$ columns to the right of $h_i$ (18), and $j - 1$ columns to the left of $h_j$ (19), and still get a block that contains $\{h_j, \dots, h_i\}$ (or even more specifically a block that contains $\{h_j, h_i\}$). In addition, columns $\{h_j, \dots, h_i\}$ must occur in the first $N - M + 1$ blocks, as these blocks equal $H$. Hence, we can upper bound the number of occurrences by $N - M + 1 + j - 1 + M - i = N - i + j$.

Next we prove the case $M > N$. When $0 \le i - j < N$, the set of columns $\{h_j, \dots, h_i\}$ may occur in $N - i + j$ blocks at most. We divide the proof into four cases.
1) $i \le N$ and $j \ge M - N + 1$. In this case the set of columns $\{h_j, \dots, h_i\}$ occurs in the first $M - N + 1$ blocks (20). As for the additional $N - 1$ pairs of columns, the set of columns belongs both to the set $\{h_1, \dots, h_N\}$ and to the set $\{h_{M-N+1}, \dots, h_M\}$. Hence, in the additional column pairs we can subtract $N - i$ columns to the right of $h_i$ (21) and $j - M + N - 1$ columns to the left of $h_j$ (22). Added together we observe that the number of occurrences cannot exceed $N - i + j$.

2) $i \le N$ and $j < M - N + 1$. In this case the set of columns can have only $j$ occurrences in the first $M - N + 1$ blocks. In this case the set $\{h_j, \dots, h_i\}$ occurs within $\{h_1, \dots, h_N\}$ but does not occur within $\{h_{M-N+1}, \dots, h_M\}$. Hence, the transmission scheme only subtracts columns to the right of $h_i$ (21). In this case we can have $N - i$ subtractions and together we get $N - i + j$ occurrences at most.

3) $i > N$ and $j \ge M - N + 1$. We have here $M - i + 1$ occurrences in the first $M - N + 1$ blocks. In this case the set $\{h_j, \dots, h_i\}$ occurs within $\{h_{M-N+1}, \dots, h_M\}$ but does not occur within $\{h_1, \dots, h_N\}$. Hence we can subtract up to $j - M + N - 1$ columns to the left of $h_j$ (22). Together there are $N - i + j$ occurrences at most.

4) Last case, $i > N$ and $j < M - N + 1$. Here the set of columns can only occur in the first $M - N + 1$ blocks. In this case there are exactly $N - i + j$ occurrences in the first $M - N + 1$ blocks.

In case $i - j \ge N$, the set of columns does not occur in any block, as each column of $G_l$ does not have more than $N$ non-zero entries.



Appendix D 
Proof of Lemma 2

We know that 



min 



where 



s=0 



o-i^j ^ ■) — < k < mmlj, L) — 1 

-1 se{fe+i,...,Ar} 



and by definition 

In order to prove the Lemma we begin with a(min(j, V) — 1, ). We know that 



a(min(j,L)-l,ep > • • • > a(0, > 0. 



AT 



V >{N - min(j, L) + 1) • min C,,,- (80) 

— ' s 

s— inin(j,L) 

where s E {min(j, L), . . . , N}. We can also see that 



Ck+i.j > , min (81) 
se{fe+i.....Af} 



for < fc < min(j, L) — 2. Hence we get 



JV 



This concludes the proof. 



Appendix E 
Proof of Theorem 5

We prove that there exists a sequence of $2K_lT_l$-real dimensional lattices (as a function of $\rho$) that attains the same diversity order as in Theorem 2. By using the Minkowski-Hlawka-Siegel Theorem [13], [19], we upper bound the error probability of the ensemble of lattices, for each channel realization. This upper bound equals the upper bound derived in Theorem 3. Then we average the upper bound over all channel realizations, and obtain the desired diversity order.

We consider a $2K_lT_l$-real dimensional ensemble of lattices, transmitted using the transmission scheme defined in subsection IV-A. We spread the first $K_lT_l$ dimensions of the lattice on the real part of the non-zero entries of $G_l$, and the other $K_lT_l$ dimensions of the lattice on the imaginary part of the non-zero entries of $G_l$. Each lattice in the ensemble has transmitter density $\gamma_{tr} = \rho^{rT_l}$, i.e. multiplexing gain $r$. We begin by analyzing the performance of the ensemble of lattices in the receiver,
for each channel realization. We assume a certain channel realization that induces a receiver VNR /i^c = p ^' ^i=i k^t^ ^ 
where ?7 > 0. For each lattice in the ensemble we get that the channel realization induces a new lattice in the receiver, ^fg^-y • x, 
with density in accordance with (O and subsection IIV-BI For lattices with regular lattice decoding, the error probability 
is equal among all codewords. Hence, it is sufficient to analyze the lattice's zero codeword error probability. We define the
indication function 

r . , r 1, ||x|| < 2i? 

In a similar manner to (l24l we can state that for each lattice induced in the receiver, Arc, the lattice zero codeword error 
probability is upper bounded by 

^BaH(0,2i?,„«)fe)-^'''(ll^cxll>ll£-^cxll)+-P?'(llBcxll >^off) (82) 

where 2KiT\cr2 ~ ^^rc, and n^^ is the effective noise in the XiT; -complex hyperplane where Arc resides in. By defining 
frc{x) = -^Sa/;(o,2i?oft)(£) ' ^''(IIb^oxII > lls ^ Gioxll)' "^c Can rewrite the upper bound on the error probability from ( |82| | 

E U2L)+Pr{\\n.J>Ros). (83) 

Note that 

Ire I frcix)dx + Pri\\n^J>R,s) (84) 

is equal to the expression in (|27| i, where j^c is the density of the lattice induced in the receiver Aj-c, as defined above. 

We need to show that there exists a single probability measure for all channel realizations, that gives an average decoding 
error probability over the ensemble, which is upper bounded by ( l84l i. Hence, we consider the ensemble of lattices in the 
transmitter which is fixed for each channel realization. For this reason we define 

y^. = {H^'-H^irH^i'-y^.- (85) 

Note that the operation in dSST l does not change the error probability of the lattice when we use regular lattice decoding. Each 
lattice in the ensemble has density jt^ ~ p*"^'. Now we define the following indication function 

r ^ ^ _ / 1. \\H-x\\< 2R 

^elHpseiH,2R){2L) - <^ 0, dse 

that is the function is one if x is within the ellipse and zero otherwise. Let us denote the error probability of a lattice in the 
ensemble for certain channel realization 77 by Pe'^\rj, p), where is a random variable that represents a certain lattice in the 
ensemble. Using regular lattice decoding, we get the following upper bound on the error probability for each lattice codeword 

Pi''\ri^p)< We(H<;^2fl.,0fe)-^KP-^cxll>P-fe-^cx)ll)+^KP-^cxll >^cff) (86) 

where yl is a KiTixKiTi matrix that satisfies A ~ H^g^ H^^g, Atr is the lattice from the ensemble that corresponds to u and 
n^^ ~ CA^(0, {H^g^ H^g)^^y Note that ( l86l ) is equal to ( |83] ), and the corresponding terms in the expressions are also equal. 
Let us define grc{x) = I^ipseiH^,] .2R,ii)^^) ' -P^dl^BcxIl >P(£ - icx)ll)- We get that 

7tr / ffrcfe)rf£ = 7rc / fi-cix)dx. (87) 

Next we show that by averaging the upper bound in (l86l l over the ensemble of lattices in the transmitter, with the correct 
probability measure, we get 

EAP^^^V.p)} < ^rc I frc{x)dx + Pr{\\n^J^ > iieff). (88) 

We prove (88) by using the Minkowski-Hlawka-Siegel theorem [13]:

Theorem 6. (Minkowski-Hlawka-Siegel Theorem) In the set of all the lattices of density $\gamma$ in $\mathbb{R}^{2K_lT_l}$, there exists a probability measure $\nu$ such that for any Riemann integrable function $g(x)$ which vanishes outside some bounded region we have

$$E_{\nu}\left\{\sum_{x \in \Lambda \setminus \{0\}} g(x)\right\} = \gamma \int g(x)\,dx \qquad (89)$$

where $E_{\nu}\{\cdot\}$ represents the expectation with respect to the measure $\nu$.

Note that considering $2K_lT_l$-real dimensional lattices enables us to use this theorem. Hence, by choosing $\gamma = \gamma_{tr}$,
$g(x) = g_{rc}(x)$, and considering (86) and (87), we get the desired upper bound (88). As a result, we can upper bound the ensemble
average decoding error probability for each channel realization by the upper bound from Theorem 3.
Now we are ready to lower bound the diversity order. According to Theorem 6 there exists a single probability measure
that satisfies (89) for any Riemann integrable function that vanishes outside some bounded region. Based on (47) and Lemma
2 we get, for the set $\{\sum_{i}\sum_{j}\xi_{i,j} \le T_l(K_l - r);\ \xi_{i,j} \ge 0\}$, a set of functions $g_{rc}(x)$ which are bounded. As a result
we can upper bound the ensemble average decoding error probability for this set by the expression from (34). For the set of
events $\{\sum_{i}\sum_{j}\xi_{i,j} > T_l(K_l - r);\ \xi_{i,j} \ge 0\}$ we upper bound the ensemble average decoding error probability by 1.
These are exactly the bounds we used in order to average over the channel realizations in Theorem 4. Hence, by
averaging over the channel realizations we get for the ensemble the same lower bound on the diversity order as in Theorem
4. This concludes the proof.

Appendix F 
Proof of Corollary 5

The proof of this corollary relies heavily on Theorem 3. We begin by describing the $L$ ensembles of IC's and how they
are transmitted. Then we use averaging arguments in order to show that there exists a single sequence of IC's that attains the
optimal DMT.

We begin by considering a sequence of $K_0T_0$-complex dimensional IC's with multiplexing gain $r = 0$, i.e. the transmitter
density $\gamma_{tr} = 1$ for any $\rho$. In a similar manner to Theorem 3, we first consider an ensemble of finite constellations drawn
uniformly within $cube_{K_0T_0}(b) \subset \mathbb{C}^{K_0T_0}$. Each code-book contains $\lfloor\gamma_{tr}b^{2K_0T_0}\rfloor = \lfloor b^{2K_0T_0}\rfloor$ points, where each point is drawn
uniformly within $cube_{K_0T_0}(b)$. Let us denote a certain finite constellation in the ensemble by $C_{FC}(\rho, K_0T_0, b) \subset cube_{K_0T_0}(b)$.
We extend each finite constellation in the ensemble into an IC in a similar manner to (32)

$$IC(\rho, K_0T_0) = C_{FC}(\rho, K_0T_0, b) + (b + b')\cdot\mathbb{Z}^{2K_0T_0}.$$ (90)

By choosing b = \J P ° +'^'^ and b ~ \J P ° ° ^'^^ we get a sequence of ensembles of IC's with multiplexing gain 
$r = 0$. For a certain channel realization $\eta \ge 0$ we get, in accordance with Theorem 3,

$$\overline{P_e}(\rho, \eta, K_0T_0) \le D(K_0T_0)\rho^{-T_0K_0+\sum_{i=1}^{K_0T_0}\eta_i}$$ (91)

where $\overline{P_e}(\rho, \eta, K_0T_0)$ is the average decoding error probability of the $K_0T_0$-complex dimensional ensemble of IC's. From
Theorem 4 we know that by transmitting the ensemble of IC's over the transmission matrix $G_0$, and averaging over the channel
realizations, we get diversity order $d_{K_0} = MN$. Transmitting over $G_0$ gives us a $K_0T_0$-complex dimensional ensemble of IC's
within $\mathbb{C}^{MT_0}$.

Next we derive from the $K_0T_0$-complex dimensional ensemble of IC's another $K_lT_l$-complex dimensional ensemble of
IC's, where $l = 1, \dots, L-1$. For each IC, $IC(\rho, K_0T_0)$, in the ensemble we take the first $\lfloor b^{2K_lT_l}\rfloor$ points in $C_{FC}(\rho, K_0T_0, b)$.
We take the components of these points inside $cube_{K_lT_l}(b)$, and denote this new finite constellation as $C_{FC}(\rho, K_lT_l, b)$. Then
we replicate these points in a similar manner to (90). In this case we get a new $K_lT_l$-complex dimensional IC

$$IC(\rho, K_lT_l) = C_{FC}(\rho, K_lT_l, b) + (b + b')\cdot\mathbb{Z}^{2K_lT_l}.$$ (92)

By doing this for each IC in the ensemble, we get a new $K_lT_l$-complex dimensional ensemble of IC's. This new ensemble is
equivalent to an ensemble of IC's generated by drawing uniformly $\lfloor b^{2K_lT_l}\rfloor$ points inside $cube_{K_lT_l}(b)$, and then replicating these
points according to (6 + 6 )I?^''^' . Each IC sequence in this ensemble has multiplexing gain r = 0. Since 6 > \J^^P^^^'^'^ 
and 6 > \/ -^^P^" > we get in accordance with Theorem |3J that for a certain channel realization ry > 

$$\overline{P_e}(\rho, \eta, K_lT_l) \le D(K_lT_l)\rho^{-T_lK_l+\sum_{i=1}^{K_lT_l}\eta_i}$$ (93)

where $\overline{P_e}(\rho, \eta, K_lT_l)$ is the average decoding error probability of the $K_lT_l$-complex dimensional ensemble of IC's. By
transmitting this ensemble of IC's over the transmission matrix $G_l$, and averaging over the channel realizations, we get diversity
order $d_{K_l} = (M-l)(N-l) + l(N+M-2\cdot l-1)$. Transmitting over $G_l$ gives us a $K_lT_l$-complex dimensional ensemble
of IC's within $\mathbb{C}^{MT_l}$.

From the sequential structure of the transmission scheme we get that omitting the $2\cdot l$ rightmost columns of $G_0$ yields $G_l$.
Hence we can derive from the $K_0T_0$-complex dimensional ensemble of IC's, which attains diversity order $d_{K_0}$, another
$K_lT_l$-complex dimensional ensemble of IC's that attains diversity order $d_{K_l}$, where $l = 1, \dots, L-1$. We attain it by diluting the points
of each $K_0T_0$-complex dimensional IC in the ensemble in the aforementioned manner, and then reducing its dimensionality
by dropping the $2\cdot l$ rightmost columns of $G_0$.
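
The dilution step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the cube is taken to be centered at the origin, "the components inside the smaller cube" is interpreted as the first $K_lT_l$ complex components, and the function names `draw_constellation` and `dilute`, the edge length `b = 2.0` and the seed are hypothetical. The dimensions 4 and 2 correspond to $K_0T_0$ and $K_1T_1$ for $M = N = 2$.

```python
import numpy as np

def draw_constellation(dim, b, rng):
    """Draw floor(b**(2*dim)) points uniformly in a complex cube of edge b.

    Assumption: the cube is centered at the origin, so real and imaginary
    parts of every component lie in [-b/2, b/2).
    """
    n_points = int(np.floor(b ** (2 * dim)))
    re = rng.uniform(-b / 2, b / 2, size=(n_points, dim))
    im = rng.uniform(-b / 2, b / 2, size=(n_points, dim))
    return re + 1j * im

def dilute(points, new_dim, b):
    """Keep the first floor(b**(2*new_dim)) points and, as an assumed
    interpretation, their first new_dim complex components."""
    n_keep = int(np.floor(b ** (2 * new_dim)))
    return points[:n_keep, :new_dim]

rng = np.random.default_rng(1)
b = 2.0
dim0, dim1 = 4, 2  # K_0*T_0 and K_1*T_1 for M = N = 2
c0 = draw_constellation(dim0, b, rng)  # finite constellation for l = 0
c1 = dilute(c0, dim1, b)               # diluted constellation for l = 1
```

Since the retained points are a prefix of independently and uniformly drawn points, the diluted constellation is itself distributed as a uniform draw inside the smaller cube, which is the equivalence the proof relies on.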

So far we have shown the connection between the ensembles. Now we would like to show that there exists a certain
sequence of $K_0T_0$-complex dimensional IC's that gives us the desired diversity orders by diluting its points and adapting
its dimensionality. We denote the average decoding error probability of the $K_lT_l$-complex dimensional ensemble of IC's by
$A_l(\rho)\rho^{-d_{K_l}}$, where $\lim_{\rho\to\infty}\frac{\log A_l(\rho)}{\log\rho} = 0$. Also define $I_{l,\rho}$ as the event where a $K_lT_l$-complex dimensional IC in the
ensemble has average decoding error probability which is smaller than or equal to $(L+1)A_l(\rho)\rho^{-d_{K_l}}$, where $l = 0, \dots, L-1$. From
averaging arguments we know that $Pr(I_{l,\rho}) \ge \frac{L}{L+1}$. We wish to show that the probability of the event $\{I_{0,\rho}\cap I_{1,\rho}\cdots\cap I_{L-1,\rho}\}$
is bounded away from zero. From averaging arguments we know that

$$Pr(I_{0,\rho}\cap I_{1,\rho}\cdots\cap I_{L-1,\rho}) \ge 1 - \sum_{l=0}^{L-1}Pr(\overline{I_{l,\rho}}) \ge \frac{1}{L+1}.$$

Hence there must exist a sequence of $K_0T_0$-complex dimensional IC's that attains diversity order $d_{K_0}$ and has multiplexing gain
$r = 0$, from which we can derive, for each $l = 1, \dots, L-1$, a sequence of $K_lT_l$-complex dimensional IC's with multiplexing
gain $r = 0$ and diversity order $d_{K_l}$.
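
The averaging argument used here can be spelled out as a short worked derivation. It is a sketch that assumes the argument is Markov's inequality followed by the union bound. The symbol $P_e^{(l)}$, for the (random) error probability of an IC drawn from the $l$-th ensemble, and $\overline{I_{l,\rho}}$, for the complement of the event $I_{l,\rho}$, are notation introduced only for this illustration.

```latex
\begin{align*}
\Pr\big(\overline{I_{l,\rho}}\big)
  &= \Pr\Big(P_e^{(l)} > (L+1)A_l(\rho)\rho^{-d_{K_l}}\Big)
   \le \frac{E\{P_e^{(l)}\}}{(L+1)A_l(\rho)\rho^{-d_{K_l}}}
   = \frac{1}{L+1}, \\
\Pr\Big(\bigcap_{l=0}^{L-1} I_{l,\rho}\Big)
  &\ge 1 - \sum_{l=0}^{L-1}\Pr\big(\overline{I_{l,\rho}}\big)
   \ge 1 - \frac{L}{L+1}
   = \frac{1}{L+1}.
\end{align*}
```

The first line uses $E\{P_e^{(l)}\} = A_l(\rho)\rho^{-d_{K_l}}$, the ensemble average. The factor $L+1$ rather than $L$ in the definition of the events is what keeps the final bound strictly positive.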

Next we show that these $L$ sequences attain the optimal DMT. Consider a sequence of $K_lT_l$-complex dimensional IC's that
has multiplexing gain $r = 0$ and attains diversity order $d_{K_l}$. From Corollary 3 we know that scaling this sequence by a scalar
$\rho^{-\frac{r}{2K_l}}$ yields a new sequence of IC's with multiplexing gain $r$ and diversity order

$$d_{K_l}(r) = (M-l)(N-l) - (r-l)(N+M-2\cdot l-1)$$

where $0 \le r \le K_l$ and $l = 0, \dots, L-1$. Each of the $L$ straight lines $d_{K_l}(r)$, $l = 0, \dots, L-1$, coincides with a different
segment out of the L segments of the optimal DMT. This concludes the proof. 
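
The final claim can be verified numerically for small antenna configurations. The Python sketch below assumes the optimal DMT of Zheng and Tse in its usual form, i.e. the piecewise-linear curve connecting the points $(k, (M-k)(N-k))$ for $k = 0, \dots, \min(M,N)$, and checks that the $l$-th line agrees with it on the segment $l \le r \le l+1$. The function names and the sampling grid are hypothetical choices made for illustration.

```python
def optimal_dmt(r, M, N):
    """Piecewise-linear curve through (k, (M-k)(N-k)), k = 0..min(M, N)."""
    k = min(int(r), min(M, N) - 1)
    d_left = (M - k) * (N - k)
    d_right = (M - k - 1) * (N - k - 1)
    return d_left + (r - k) * (d_right - d_left)

def line_l(r, l, M, N):
    """d_{K_l}(r) = (M-l)(N-l) - (r-l)(N+M-2l-1)."""
    return (M - l) * (N - l) - (r - l) * (N + M - 2 * l - 1)

def segments_match(M, N, steps=8):
    """Check that line l coincides with the optimal DMT on [l, l+1]."""
    for l in range(min(M, N)):
        for s in range(steps + 1):
            r = l + s / steps
            if abs(line_l(r, l, M, N) - optimal_dmt(r, M, N)) > 1e-9:
                return False
    return True
```

For example, `segments_match(2, 2)` and `segments_match(3, 5)` both return `True`, and `line_l(0, l, M, N)` reproduces the diversity order $d_{K_l}$ at multiplexing gain $r = 0$.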